Threshold Saturation via Spatial Coupling: Why
Convolutional LDPC Ensembles Perform so well
over the BEC

Shrinivas Kudekar*, Tom Richardson† and Rüdiger Urbanke*
* School of Computer and Communication Sciences 

EPFL, Lausanne, Switzerland 
Email: {shrinivas.kudekar, ruediger.urbanke}@epfl.ch 
† Qualcomm, USA
Email: tjr@qualcomm.com 



Abstract — Convolutional LDPC ensembles, introduced by Fel- 
strom and Zigangirov, have excellent thresholds and these thresh- 
olds are rapidly increasing functions of the average degree. 
Several variations on the basic theme have been proposed to 
date, all of which share the good performance characteristics of 
convolutional LDPC ensembles. 

We describe the fundamental mechanism which explains why 
"convolutional-like" or "spatially coupled" codes perform so well. 
In essence, the spatial coupling of the individual code structure 
has the effect of increasing the belief-propagation threshold of 
the new ensemble to its maximum possible value, namely the 
maximum-a-posteriori threshold of the underlying ensemble. For 
this reason we call this phenomenon "threshold saturation". 

This gives an entirely new way of approaching capacity. One 
significant advantage of such a construction is that one can create 
capacity-approaching ensembles with an error correcting radius 
which is increasing in the blocklength. Our proof makes use 
of the area theorem of the belief-propagation EXIT curve and 
the connection between the maximum-a-posteriori and belief- 
propagation threshold recently pointed out by Measson, Monta- 
nari, Richardson, and Urbanke. 

Although we prove the connection between the maximum- 
a-posteriori and the belief-propagation threshold only for a 
very specific ensemble and only for the binary erasure channel, 
empirically a threshold saturation phenomenon occurs for a wide 
class of ensembles and channels. More generally, we conjecture 
that for a large range of graphical systems a similar saturation 
of the "dynamical" threshold occurs once individual components 
are coupled sufficiently strongly. This might give rise to improved 
algorithms as well as to new techniques for analysis. 



I. Introduction 

We consider the design of capacity-approaching codes based 
on the connection between the belief-propagation (BP) and 
maximum-a-posteriori (MAP) threshold of sparse graph codes. 
Recall that the BP threshold is the threshold of the "locally 
optimum" BP message-passing algorithm. As such it has low 
complexity. The MAP threshold, on the other hand, is the 
threshold of the "globally optimum" decoder. No decoder can
do better, but the complexity of the MAP decoder is in general 
high. The threshold itself is the unique channel parameter 
so that for channels with lower (better) parameter decoding 
succeeds with high probability (for large instances) whereas 
for channels with higher (worse) parameters decoding fails 
with high probability. Surprisingly, for sparse graph codes 



there is a connection between these two thresholds, see [1],
[2].¹

We discuss a fundamental mechanism which ensures that 
these two thresholds coincide (or at least are very close). 
We call this phenomenon "threshold saturation via spatial 
coupling." A prime example where this mechanism is at work
is the family of convolutional low-density parity-check (LDPC) ensembles.

It was Tanner who introduced the method of "unwrapping" 
a cyclic block code into a convolutional structure [3], [4]. The 
first low-density convolutional ensembles were introduced by 
Felstrom and Zigangirov [5]. Convolutional LDPC ensembles 
are constructed by coupling several standard (l, r)-regular 
LDPC ensembles together in a chain. Perhaps surprisingly, 
due to the coupling, and assuming that the chain is finite and 
properly terminated, the threshold of the resulting ensemble 
is considerably improved. Indeed, if we start with a (3, 6)-
regular ensemble, then on the binary erasure channel (BEC)
the threshold is improved from $\epsilon^{\text{BP}}(l = 3, r = 6) \approx 0.4294$ to
roughly 0.4881 (the capacity for this case is 1/2). The latter
number is the MAP threshold $\epsilon^{\text{MAP}}(l, r)$ of the underlying
(3, 6)-regular ensemble. This opens up an entirely new way
of constructing capacity-approaching ensembles. It is a folk 
theorem that for standard constructions improvements in the 
BP threshold go hand in hand with increases in the error floor.
More precisely, a large fraction of degree-two variable nodes 
is typically needed in order to get large thresholds under BP 
decoding. Unfortunately, the higher the fraction of degree-two 
variable nodes, the more low-weight codewords (small cycles, 
small stopping sets, ...) appear. Under MAP decoding on the 
other hand these two quantities are positively correlated. To 
be concrete, if we consider the sequence of (l, 2l)-regular
ensembles of rate one-half, by increasing l we increase both
the MAP threshold as well as the typical minimum distance. 
It is therefore possible to construct ensembles that have large 
MAP thresholds and low error floors. 

¹There are some trivial instances in which the two thresholds coincide.
This is e.g. the case for so-called "cycle ensembles" or, more generally, for 
irregular LDPC ensembles that have a large fraction of degree-two variable 
nodes. In these cases the reason for this agreement is that for both decoders 
the performance is dominated by small structures in the graph. But for 
general ensembles these two thresholds are distinct and, indeed, they can 
differ significantly. 

The potential of convolutional LDPC codes has long been
recognized. Our contribution lies therefore not in the introduction
of a new coding scheme, but in clarifying the basic
mechanism that makes convolutional-like ensembles perform
so well.

There is a considerable literature on convolutional-like 
LDPC ensembles. Variations on the constructions as well as 
some analysis can be found in Engdahl and Zigangirov [6], En- 
gdahl, Lentmaier, and Zigangirov [7], Lentmaier, Truhachev, 
and Zigangirov [8], as well as Tanner, D. Sridhara, A. 
Sridharan, Fuja, and Costello [9]. In [10], [11], Sridharan, 
Lentmaier, Costello and Zigangirov consider density evolution 
(DE) for convolutional LDPC ensembles and determine thresh- 
olds for the BEC. The equivalent observations for general 
channels were reported by Lentmaier, Sridharan, Zigangirov 
and Costello in [11], [12]. The preceding two sets of works 
are perhaps the most pertinent to our setup. By considering the 
resulting thresholds and comparing them to the thresholds of 
the underlying ensembles under MAP decoding (see e.g. [13]) 
it becomes quickly apparent that an interesting physical effect 
must be at work. Indeed, in a recent paper [14], Lentmaier and 
Fettweis followed this route and independently formulated the 
equality of the BP threshold of convolutional LDPC ensembles 
and the MAP threshold of the underlying ensemble as a 
conjecture. They attribute this numerical observation to G. 
Liva. 

A representation of convolutional LDPC ensembles in terms 
of a protograph was introduced by Mitchell, Pusane, Zigan- 
girov and Costello [15]. The corresponding representation 
for terminated convolutional LDPC ensembles was introduced 
by Lentmaier, Fettweis, Zigangirov and Costello [16]. A 
pseudo-codeword analysis of convolutional LDPC codes was 
performed by Smarandache, Pusane, Vontobel, and Costello in 
[17], [18]. In [19], Papaleo, Iyengar, Siegel, Wolf, and Corazza 
consider windowed decoding of convolutional LDPC codes on 
the BEC to study the trade-off between the decoding latency 
and the code performance. 

In the sequel we will assume that the reader is familiar 
with basic notions of sparse graph codes and message-passing 
decoding, and in particular with the asymptotic analysis of 
LDPC ensembles for transmission over the binary erasure 
channel as it was accomplished in [20]. We summarize
the most important facts which are needed for our proof in
Section III-A but this summary is not meant to be a gentle
introduction to the topic. Our notation follows for the most 
part the one in [13]. 

II. Convolutional-Like LDPC Ensembles

The principle that underlies the good performance of 
convolutional-like LDPC ensembles is very broad and there 
are many degrees of freedom in constructing such ensembles. 
In the sequel we introduce two basic variants. The (l, r, L)-
ensemble is very close to the ensemble discussed in [16].
Experimentally it has a very good performance. We conjecture 
that it is capable of achieving capacity. 

We also introduce the ensemble (l, r, L, w). Experimentally 
it shows a worse trade-off between rate, threshold, and blocklength.
But it is easier to analyze and we will show that it
is capacity achieving. One can think of w as a "smoothing
parameter" and we investigate the behavior of this ensemble 
when w tends to infinity. 

A. The (l, r, L) Ensemble

To start, consider a protograph of a standard (3, 6)-regular 
ensemble (see [21], [22] for the definition of protographs). It 
is shown in Figure 1. There are two variable nodes and there is
one check node. Let M denote the number of variable nodes at
each position. For our example, M = 100 means that we have
50 copies of the protograph so that we have 100 variable nodes 
at each position. For all future discussions we will consider 
the regime where M tends to infinity. 



Fig. 1. Protograph of a standard (3, 6)-regular ensemble.

Next, consider a collection of (2L + 1) such protographs as
shown in Figure 2. These protographs are non-interacting and
so each component behaves just like a standard (3, 6)-regular
component. In particular, the belief-propagation (BP) threshold
of each protograph is just the standard threshold, call it
$\epsilon^{\text{BP}}(l = 3, r = 6)$ (see Lemma 4 for an analytic characterization of this
threshold). Slightly more generally: start with an $(l, r = kl)$-
regular ensemble where $l$ is odd so that $\hat{l} = (l - 1)/2 \in \mathbb{N}$.

Fig. 2. A chain of (2L + 1) protographs of the standard (3, 6)-regular
ensembles for L = 9. These protographs do not interact.

An interesting phenomenon occurs if we couple these components.
To achieve this coupling, connect each protograph to
$\hat{l}$² protographs "to the left" and to $\hat{l}$ protographs "to the right."
This is shown in Figure 3 for the two cases (l = 3, r = 6)
and (l = 7, r = 14). In this figure, $\hat{l}$ extra check nodes are
added on each side to connect the "overhanging" edges at the
boundary.

Fig. 3. Two coupled chains of protographs with L = 9 and (l = 3, r = 6)
(top) and L = 7 and (l = 7, r = 14) (bottom), respectively.

²If we think of this as a convolutional code, then $2\hat{l}$ is the syndrome former
memory of the code.

There are two main effects resulting from this coupling:
(i) Rate Reduction: Recall that the design rate of the underlying
standard $(l, r = kl)$-regular ensemble is $1 - \frac{l}{r} = \frac{k-1}{k}$.
Let us determine the design rate of the
corresponding $(l, r = kl, L)$ ensemble. By design rate
we mean here the rate that we get if we assume that
every involved check node imposes a linearly independent
constraint.

The variable nodes are indexed from $-L$ to $L$ so that
in total there are $(2L + 1)M$ variable nodes. The check
nodes are indexed from $-(L + \hat{l})$ to $(L + \hat{l})$, so that in
total there are $(2(L + \hat{l}) + 1)M/k$ check nodes. We see
that, due to boundary effects, the design rate is reduced
to

$$R(l, r = kl, L) = \frac{(2L + 1) - (2(L + \hat{l}) + 1)/k}{2L + 1} = \frac{k-1}{k} - \frac{2\hat{l}}{k(2L + 1)},$$

where the first term on the right represents the design rate
of the underlying standard $(l, r = kl)$-regular ensemble
and the second term represents the rate loss. As we see,
this rate reduction effect vanishes at a speed 1/L.
(ii) Threshold Increase: The threshold changes dramatically
from $\epsilon^{\text{BP}}(l, r)$ to something close to $\epsilon^{\text{MAP}}(l, r)$ (the MAP
threshold of the underlying standard (l, r)-regular ensemble;
see Lemma 4). This phenomenon (which we call
"threshold saturation") is much less intuitive and it is the
aim of this paper to explain why this happens.
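
The rate loss in item (i) is easy to evaluate numerically. The following is a minimal Python sketch, not part of the original paper; the function names are ours. It evaluates $R(l, r = kl, L) = \frac{k-1}{k} - \frac{2\hat{l}}{k(2L+1)}$ and cross-checks it against the direct count of variable and check nodes given above.

```python
from fractions import Fraction

def coupled_design_rate(l: int, k: int, L: int) -> Fraction:
    """Design rate of the (l, r = k*l, L) ensemble; l must be odd."""
    if l % 2 == 0:
        raise ValueError("l must be odd")
    l_hat = (l - 1) // 2
    return Fraction(k - 1, k) - Fraction(2 * l_hat, k * (2 * L + 1))

def coupled_design_rate_by_counting(l: int, k: int, L: int) -> Fraction:
    """Same rate from node counts per unit M: 1 - (#checks / #variables)."""
    l_hat = (l - 1) // 2
    variables = 2 * L + 1                       # positions -L..L, M nodes each
    checks = Fraction(2 * (L + l_hat) + 1, k)   # positions -(L+l_hat)..(L+l_hat), M/k each
    return 1 - checks / variables

for L in (1, 8, 64, 512):
    rate = coupled_design_rate(3, 2, L)
    assert rate == coupled_design_rate_by_counting(3, 2, L)
    print(L, float(rate))   # approaches 1/2 as L grows
```

For the (3, 6) case the gap to rate 1/2 is $1/(2L+1)$, which illustrates the 1/L decay of the rate loss.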

So far we have considered $(l, r = kl)$-regular ensembles.
Let us now give a general definition of the (l, r, L)-ensemble
which works for all parameters (l, r) so that $l$ is odd.
Rather than starting from a protograph, place variable nodes at
positions $[-L, L]$. At each position there are M such variable
nodes. Place $\frac{l}{r}M$ check nodes at each position $[-L - \hat{l}, L + \hat{l}]$.
Connect exactly one of the $l$ edges of each variable node at
position $i$ to a check node at position $i - \hat{l}, \ldots, i + \hat{l}$.

Note that at each position $i \in [-L + \hat{l}, L - \hat{l}]$, there are
exactly $M\frac{l}{r}r = Ml$ check node sockets.³ Exactly $M$ of those
come from variable nodes at each position $i - \hat{l}, \ldots, i + \hat{l}$.
For check nodes at the boundary the number of sockets is
decreased linearly according to their position. The probability
distribution of the ensemble is defined by choosing a random
permutation on the set of all edges for each check node
position.

The next lemma, whose proof can be found in Appendix I,
asserts that the minimum stopping set distance of most codes
in this ensemble is at least a fixed fraction of M. With
respect to the technique used in the proof we follow the 
lead of [15], [18] and [17], [22] which consider distance and 
pseudo-distance analysis of convolutional LDPC ensembles, 
respectively. 

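Lemma 1 (Stopping Set Distance of (l, r, L)-Ensemble):
Consider the (l, r, L)-ensemble with $l = 2\hat{l} + 1$, $\hat{l} \geq 1$, and
$r \geq l$. Define

$$p(x) = \sum_{i \neq 1} \binom{r}{i} x^i, \qquad a(x) = \frac{\sum_{i \neq 1} \binom{r}{i}\, i\, x^i}{\sum_{i \neq 1} \binom{r}{i} x^i},$$

$$b(x) = -(l-1)\, h_2(a(x)/r) + \frac{l}{r}\log_2(p(x)) - a(x)\frac{l}{r}\log_2(x),$$

$$\omega(x) = a(x)/r, \qquad h_2(x) = -x\log_2(x) - (1-x)\log_2(1-x).$$

Let $\hat{x}$ denote the unique strictly positive solution of the
equation $b(x) = 0$ and let $\hat{\omega}(l, r) = \omega(\hat{x})$. Then, for any
$\delta > 0$,

$$\lim_{M \to \infty} \mathbb{P}\{d_{ss}(C)/M < (1 - \delta)\, l\, \hat{\omega}(l, r)\} = 0,$$

where $d_{ss}(C)$ denotes the minimum stopping set distance of
the code C.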

Discussion: The quantity $\hat{\omega}(l, r)$ is the relative weight (normalized
to the blocklength) at which the exponent of the
expected stopping set distribution of the underlying standard
(l, r)-regular ensemble becomes positive. It is perhaps not too
surprising that the same quantity also appears in our context.
The lemma asserts that the minimum stopping set distance
grows linearly in M. But the stated bound does not scale with
L. We leave it as an interesting open problem to determine
whether this is due to the looseness of our bound or whether
our bound indeed reflects the correct behavior.

Example 2 ((l = 3, r = 6, L)): An explicit calculation
shows that $\hat{x} \approx 0.058$ and $3\hat{\omega}(3, 6) \approx 0.056$. Let
$n = M(2L + 1)$ be the blocklength. If we assume that
$2L + 1 = M^{\alpha}$, $\alpha \in (0, 1)$, then $M = n^{\frac{1}{1+\alpha}}$. Lemma 1
asserts that the minimum stopping set distance grows in the
blocklength at least as $0.056\, n^{\frac{1}{1+\alpha}}$.

B. The (l, r, L, w) Ensemble

In order to simplify the analysis we modify the ensemble 
(l, r, L) by adding a randomization of the edge connections.

³Sockets are connection points where edges can be attached to a node. E.g.,
if a node has degree 3 then we imagine that it has 3 sockets. This terminology 
arises from the so-called configuration model of LDPC ensembles. In this 
model we imagine that we label all check-node sockets and all variable-node 
sockets with the set of integers from one to the cardinality of the sockets. To 
construct then a particular element of the ensemble we pick a permutation 
on this set uniformly at random from the set of all permutations and connect 
variable-node sockets to check-node sockets according to this permutation. 

For the remainder of this paper we always assume that $r \geq l$,
so that the ensemble has a non-trivial design rate.

We assume that the variable nodes are at positions $[-L, L]$,
$L \in \mathbb{N}$. At each position there are M variable nodes, $M \in \mathbb{N}$.
Conceptually we think of the check nodes to be located
at all integer positions from $[-\infty, \infty]$. Only some of these
positions actually interact with the variable nodes. At each
position there are $\frac{l}{r}M$ check nodes. It remains to describe
how the connections are chosen.

Rather than assuming that a variable at position $i$ has exactly
one connection to a check node at position $[i - \hat{l}, \ldots, i + \hat{l}]$,
we assume that each of the $l$ connections of a variable node
at position $i$ is uniformly and independently chosen from the
range $[i, \ldots, i + w - 1]$, where $w$ is a "smoothing" parameter.
In the same way, we assume that each of the $r$ connections of
a check node at position $i$ is independently chosen from the
range $[i - w + 1, \ldots, i]$. We no longer require that $l$ is odd.

More precisely, the ensemble is defined as follows. Consider
a variable node at position $i$. The variable node has
$l$ outgoing edges. A type $t$ is a $w$-tuple of non-negative
integers, $t = (t_0, \ldots, t_{w-1})$, so that $\sum_{j=0}^{w-1} t_j = l$. The
operational meaning of $t$ is that the variable node has $t_j$
edges which connect to a check node at position $i + j$. There
are $\binom{l + w - 1}{w - 1}$ types. Assume that for each variable we order
its edges in an arbitrary but fixed order. A constellation $c$ is
an $l$-tuple, $c = (c_1, \ldots, c_l)$ with elements in $[0, w - 1]$. Its
operational significance is that if a variable node at position
$i$ has constellation $c$ then its $k$-th edge is connected to a
check node at position $i + c_k$. Let $\tau(c)$ denote the type of
a constellation. Since we want the position of each edge to
be chosen independently we impose a uniform distribution
on the set of all constellations. This imposes the following
distribution on the set of all types. We assign the probability

$$p(t) = \frac{|\{c : \tau(c) = t\}|}{w^l}.$$

Pick M so that $Mp(t)$ is a natural number for all types $t$.
For each position $i$ pick $Mp(t)$ variables which have their
edges assigned according to type $t$. Further, use a random
permutation for each variable, uniformly chosen from the
set of all permutations on $l$ letters, to map a type to a
constellation.

Under this assignment, and ignoring boundary effects, for
each check position $i$, the number of edges that come from
variables at position $i - j$, $j \in [0, w - 1]$, is $M\frac{l}{w}$. In other
words, it is exactly a fraction $\frac{1}{w}$ of the total number $Ml$
of sockets at position $i$. At the check nodes, distribute these
edges according to a permutation chosen uniformly at random
from the set of all permutations on $Ml$ letters, to the $M\frac{l}{r}$
check nodes at this position. It is then not very difficult to see
that, under this distribution, for each check node each edge
is roughly independently chosen to be connected to one of
its nearest $w$ "left" neighbors. Here, "roughly independent"
means that the corresponding probability deviates at most by a
term of order 1/M from the desired distribution. As discussed
beforehand, we will always consider the limit in which M first
tends to infinity and then the number of iterations tends to
infinity. Therefore, for any fixed number of rounds of DE the
probability model is exactly the independent model described
above.
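
To make the type and constellation bookkeeping concrete, here is a small Python sketch (our own illustration, with hypothetical function names). It enumerates all $w^l$ constellations, groups them by type, and checks three facts used above: the number of types is $\binom{l+w-1}{w-1}$, the induced $p(t)$ is a multinomial coefficient divided by $w^l$, and on average a variable sends $l/w$ edges to each of the $w$ offsets. It also computes the smallest M for which $Mp(t)$ is an integer for every type.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb, factorial, gcd, prod

def type_distribution(l: int, w: int) -> dict:
    """Map each type t = (t_0, ..., t_{w-1}) to p(t) = |{c : tau(c) = t}| / w**l."""
    counts = Counter()
    for c in product(range(w), repeat=l):         # all w**l constellations
        t = tuple(c.count(j) for j in range(w))   # tau(c): number of edges per offset j
        counts[t] += 1
    return {t: Fraction(n, w ** l) for t, n in counts.items()}

l, w = 3, 3
dist = type_distribution(l, w)

assert len(dist) == comb(l + w - 1, w - 1)        # number of types
assert sum(dist.values()) == 1
for t, p in dist.items():                         # multinomial coefficient / w**l
    assert p == Fraction(factorial(l), prod(factorial(tj) for tj in t) * w ** l)
for j in range(w):                                # mean number of edges per offset is l/w
    assert sum(p * t[j] for t, p in dist.items()) == Fraction(l, w)

# Smallest M such that M * p(t) is a natural number for every type t.
min_M = 1
for p in dist.values():
    min_M = min_M * p.denominator // gcd(min_M, p.denominator)
print(len(dist), min_M)
```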

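Lemma 3 (Design Rate): The design rate of the ensemble
(l, r, L, w), with $w \leq 2L$, is given by

$$R(l, r, L, w) = \Big(1 - \frac{l}{r}\Big) - \frac{l}{r}\,\frac{w + 1 - 2\sum_{i=0}^{w}\big(\frac{i}{w}\big)^r}{2L + 1}.$$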

Proof: Let $V$ be the number of variable nodes and $C$ be
the number of check nodes that are connected to at least one
of these variable nodes. Recall that we define the design rate
as $1 - C/V$.

There are $V = M(2L + 1)$ variables in the graph. The
check nodes that have potential connections to variable nodes
in the range $[-L, L]$ are indexed from $-L$ to $L + w - 1$.
Consider the check nodes at position $-L$. Each of the
$r$ edges of each such check node is chosen independently
from the range $[-L - w + 1, -L]$. The probability that such a
check node has at least one connection in the range $[-L, L]$
is equal to $1 - \big(\frac{w-1}{w}\big)^r$. Therefore, the expected number of
check nodes at position $-L$ that are connected to the code is
equal to $M\frac{l}{r}\big(1 - \big(\frac{w-1}{w}\big)^r\big)$. In a similar manner, the expected
number of check nodes at position $-L + i$, $i = 0, \ldots, w - 1$,
that are connected to the code is equal to $M\frac{l}{r}\big(1 - \big(\frac{w-i-1}{w}\big)^r\big)$.
All check nodes at positions $-L + w, \ldots, L - 1$ are connected.
Further, by symmetry, check nodes in the range $L, \ldots, L + w - 1$
have an identical contribution as check nodes in the range
$-L, \ldots, -L + w - 1$. Summing up all these contributions, we
see that the number of check nodes which are connected is
equal to

$$C = M\frac{l}{r}\Big[2L - w + 2\Big(w + 1 - \sum_{i=0}^{w}\Big(\frac{i}{w}\Big)^r\Big)\Big].$$ ■



Discussion: In the above lemma we have defined the design 
rate as the normalized difference of the number of variable 
nodes and the number of check nodes that are involved in the 
ensemble. This leads to a relatively simple expression which 
is suitable for our purposes. But in this ensemble there is a 
non-zero probability that there are two or more degree-one 
check nodes attached to the same variable node. In this case, 
some of these degree-one check nodes are redundant and do 
not impose constraints. This effect only happens for variable 
nodes close to the boundary. Since we consider the case where 
L tends to infinity, this slight difference between the "design 
rate" and the "true rate" does not play a role. We therefore opt 
for this simple definition. The design rate is a lower bound on 
the true rate. 
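
The counting argument in the proof of Lemma 3 can be checked mechanically. The sketch below (our own, with hypothetical function names) compares the closed form $R(l, r, L, w) = (1 - \frac{l}{r}) - \frac{l}{r}\frac{w + 1 - 2\sum_{i=0}^{w}(i/w)^r}{2L+1}$ with a position-by-position sum of the expected number of connected check nodes, using exact rational arithmetic and assuming $w \leq 2L$.

```python
from fractions import Fraction

def design_rate(l: int, r: int, L: int, w: int) -> Fraction:
    """Closed-form design rate of the (l, r, L, w) ensemble (assumes w <= 2L)."""
    s = sum(Fraction(i, w) ** r for i in range(w + 1))
    return (1 - Fraction(l, r)) - Fraction(l, r) * (w + 1 - 2 * s) / (2 * L + 1)

def design_rate_by_counting(l: int, r: int, L: int, w: int) -> Fraction:
    """1 - C/V, with C accumulated check position by check position (per unit M)."""
    checks = Fraction(0)
    for pos in range(-L, L + w):                  # check positions -L .. L+w-1
        # a check at `pos` draws each edge from variable positions pos-w+1 .. pos
        inside = sum(1 for v in range(pos - w + 1, pos + 1) if -L <= v <= L)
        p_unconnected = Fraction(w - inside, w) ** r
        checks += Fraction(l, r) * (1 - p_unconnected)
    return 1 - checks / (2 * L + 1)

for (l, r, L, w) in [(3, 6, 16, 3), (3, 6, 2, 4), (4, 8, 5, 2)]:
    assert design_rate(l, r, L, w) == design_rate_by_counting(l, r, L, w)

print(float(design_rate(3, 6, 16, 3)))   # rate loss relative to 1/2 shrinks like 1/L
```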

C. Other Variants 

There are many variations on the theme that show the same 
qualitative behavior. For real applications these and possibly 
other variations are vital to achieve the best trade-offs. Let us 
give a few select examples.
(i) Diminished Rate Loss: One can start with a cycle (as 
is the case for tailbiting codes) rather than a chain so 
that some of the extra check nodes which we add at the 
boundary can be used for the termination on both sides. 
This reduces the rate-loss. 
(ii) Irregular and Structured Ensembles: We can start with
irregular or structured ensembles. Arrange a number of
graphs next to each other in a horizontal order. Couple
them by connecting neighboring graphs up to some order.
Empirically, once the coupling is "strong" enough and
spread out sufficiently, the threshold is "very close" to
the MAP threshold of the underlying ensembles. See also
[23] for a study of such ensembles.

The main aim of this paper is to explain why coupled LDPC 
codes perform so well rather than optimizing the ensemble. 
Therefore, despite the practical importance of these variations, 
we focus on the ensemble (l, r, L, w). It is the simplest to
analyze. 

III. General Principle 

As mentioned before, the basic reason why coupled ensem- 
bles have such good thresholds is that their BP threshold is 
very close to the MAP threshold of the underlying ensemble. 
Therefore, as a starting point, let us review how the BP and 
the MAP threshold of the underlying ensemble can be charac- 
terized. A detailed explanation of the following summary can 
be found in [13]. 

A. The Standard (l, r)-Regular Ensemble: BP versus MAP

Consider density evolution (DE) of the standard (l, r)-
regular ensemble. More precisely, consider the fixed point (FP)
equation

$$x = \epsilon\,(1 - (1 - x)^{r-1})^{l-1}, \qquad (1)$$

where $\epsilon$ is the channel erasure value and $x$ is the average
erasure probability flowing from the variable node side to the
check node side. Both the BP as well as the MAP threshold
of the (l, r)-regular ensemble can be characterized in terms
of solutions (FPs) of this equation.
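
As an illustration of equation (1), the following Python sketch runs forward DE, i.e., it iterates $x \leftarrow \epsilon(1 - (1-x)^{r-1})^{l-1}$ starting from $x = 1$. The function name and the iteration count are our choices. For the (3, 6)-regular ensemble the iteration is driven to zero for a channel parameter slightly below 0.4294 and gets stuck at a strictly positive FP slightly above it.

```python
def density_evolution(eps: float, l: int = 3, r: int = 6, iterations: int = 5000) -> float:
    """Iterate x <- eps * (1 - (1 - x)**(r-1))**(l-1), starting from x = 1."""
    x = 1.0
    for _ in range(iterations):
        x = eps * (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1)
    return x

print(density_evolution(0.42))   # below the (3, 6) BP threshold: erasure probability tends to 0
print(density_evolution(0.44))   # above it: DE settles at a strictly positive fixed point
```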

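Lemma 4 (Analytic Characterization of Thresholds):
Consider the (l, r)-regular ensemble. Let $\epsilon^{\text{BP}}(l, r)$ denote its
BP threshold and let $\epsilon^{\text{MAP}}(l, r)$ denote its MAP threshold.
Define

$$p^{\text{BP}}(x) = ((l-1)(r-1) - 1)(1 - x)^{r-2} - \sum_{i=0}^{r-3}(1 - x)^i,$$

$$p^{\text{MAP}}(x) = x + \frac{1}{r}(1 - x)^{r-1}(l + l(r-1)x - rx) - \frac{l}{r},$$

$$\epsilon(x) = \frac{x}{(1 - (1 - x)^{r-1})^{l-1}}.$$

Let $x^{\text{BP}}$ be the unique positive solution of the equation $p^{\text{BP}}(x) = 0$
and let $x^{\text{MAP}}$ be the unique positive solution of the equation
$p^{\text{MAP}}(x) = 0$. Then $\epsilon^{\text{BP}}(l, r) = \epsilon(x^{\text{BP}})$ and $\epsilon^{\text{MAP}}(l, r) = \epsilon(x^{\text{MAP}})$.
We remark that above, for ease of notation, we drop the
dependence of $x^{\text{BP}}$ and $x^{\text{MAP}}$ on $l$ and $r$.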

Example 5 (Thresholds of (3, 6)-Ensemble): Explicit computations
show that $\epsilon^{\text{BP}}(l = 3, r = 6) \approx 0.42944$ and
$\epsilon^{\text{MAP}}(l = 3, r = 6) \approx 0.488151$.
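
The numbers in Example 5 can be reproduced from Lemma 4 with a few lines of Python. This is only a sketch under stated assumptions: it uses the polynomials $p^{\text{BP}}(x) = ((l-1)(r-1)-1)(1-x)^{r-2} - \sum_{i=0}^{r-3}(1-x)^i$ and $p^{\text{MAP}}(x) = x + \frac{1}{r}(1-x)^{r-1}(l + l(r-1)x - rx) - \frac{l}{r}$, finds their positive roots by bisection, and assumes that $p^{\text{MAP}}$ is negative at $x^{\text{BP}}$ so that $[x^{\text{BP}}, 1]$ brackets $x^{\text{MAP}}$. The helper names are ours.

```python
def _bisect(f, lo: float, hi: float, tol: float = 1e-14) -> float:
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

def thresholds(l: int, r: int):
    """Return (eps_BP, eps_MAP) of the (l, r)-regular ensemble on the BEC."""
    def eps(x):
        return x / (1 - (1 - x) ** (r - 1)) ** (l - 1)

    def p_bp(x):
        return ((l - 1) * (r - 1) - 1) * (1 - x) ** (r - 2) - sum(
            (1 - x) ** i for i in range(r - 2))

    def p_map(x):
        return x + (1 - x) ** (r - 1) * (l + l * (r - 1) * x - r * x) / r - l / r

    x_bp = _bisect(p_bp, 1e-12, 1.0)
    x_map = _bisect(p_map, x_bp, 1.0)   # assumed bracket: p_map(x_bp) < 0 < p_map(1)
    return eps(x_bp), eps(x_map)

eps_bp, eps_map = thresholds(3, 6)
print(round(eps_bp, 5), round(eps_map, 6))
```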

Lemma 6 (Graphical Characterization of Thresholds):
The left-hand side of Figure 4 shows the so-called extended BP
(EBP) EXIT curve associated to the (3, 6)-regular ensemble.
This is the curve given by $\{\epsilon(x), (1 - (1 - x)^{r-1})^l\}$,
$0 \leq x \leq 1$. For all regular ensembles with $l \geq 3$ this curve
has a characteristic "C" shape. It starts at the point (1, 1)
for $x = 1$ and then moves downwards until it "leaves" the
unit box at the point $(1, x_u(1))$ and extends to infinity. The
right-hand side of Figure 4 shows the BP EXIT curve (dashed
line). It is constructed from the EBP EXIT curve by "cutting
off" the lower branch and by completing the upper branch
via a vertical line.

Fig. 4. Left: The EBP EXIT curve $h^{\text{EBP}}$ of the (l = 3, r = 6)-regular
ensemble. The curve goes "outside the box" at the point $(1, x_u(1))$ and tends
to infinity. Right: The BP EXIT function $h^{\text{BP}}(\epsilon)$. Both the BP as well as the
MAP threshold are determined by $h^{\text{BP}}(\epsilon)$.

The BP threshold $\epsilon^{\text{BP}}(l, r)$ is the point at which this vertical
line hits the x-axis. In other words, the BP threshold $\epsilon^{\text{BP}}(l, r)$
is equal to the smallest $\epsilon$-value which is taken on along the
EBP EXIT curve.

Lemma 7 (Lower Bound on $x^{\text{BP}}$): For the (l, r)-regular ensemble

$$x^{\text{BP}}(l, r) \geq 1 - (l - 1)^{-\frac{1}{r-2}}.$$

Proof: Consider the polynomial $p^{\text{BP}}(x)$. Note that
$p^{\text{BP}}(x) \geq \tilde{p}(x) = ((l-1)(r-1) - 1)(1 - x)^{r-2} - (r - 2)$ for $x \in [0, 1]$.
Since $p^{\text{BP}}(0) \geq \tilde{p}(0) = (l - 2)(r - 1) > 0$, the positive
root of $\tilde{p}(x)$ is a lower bound on the positive root of $p^{\text{BP}}(x)$.
But the positive root of $\tilde{p}(x)$ is at $1 - \big(\frac{r-2}{(l-1)(r-1)-1}\big)^{\frac{1}{r-2}}$. This
in turn is lower bounded by $1 - (l - 1)^{-\frac{1}{r-2}}$. ■

To construct the MAP threshold $\epsilon^{\text{MAP}}(l, r)$, integrate the BP
EXIT curve starting at $\epsilon = 1$ until the area under this curve
is equal to the design rate of the code. The point at which
equality is achieved is the MAP threshold (see the right-hand
side of Figure 4).
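
The area construction can be carried out numerically. The sketch below (ours, with hypothetical names and an arbitrary step count) walks along the upper branch of the EBP EXIT curve $(\epsilon(x), (1 - (1-x)^{r-1})^l)$ from $x = 1$ downwards, accumulates the area under the curve with the trapezoidal rule, and stops once the area equals the design rate $1 - l/r$. For the (3, 6)-regular ensemble this recovers the MAP threshold of about 0.4882.

```python
def map_threshold_by_area(l: int, r: int, steps: int = 200_000) -> float:
    """Integrate the BP EXIT curve from eps = 1 downwards until the area
    equals the design rate 1 - l/r; return the eps at which this happens."""
    rate = 1.0 - l / r

    def eps(x):            # channel parameter along the EBP EXIT curve
        return x / (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1)

    def exit_value(x):     # EXIT value along the EBP EXIT curve
        return (1.0 - (1.0 - x) ** (r - 1)) ** l

    area = 0.0
    e_prev, h_prev = 1.0, 1.0                  # the curve starts at (1, 1) for x = 1
    for k in range(1, steps):
        x = 1.0 - k / steps
        e, h = eps(x), exit_value(x)
        if e > e_prev:
            break                              # left the upper branch (passed x_BP)
        slab = 0.5 * (h_prev + h) * (e_prev - e)   # trapezoid in the (eps, h) plane
        if area + slab >= rate:
            # linear interpolation inside the last slab
            return e_prev - (rate - area) / slab * (e_prev - e)
        area += slab
        e_prev, h_prev = e, h
    raise ValueError("area under the BP EXIT curve never reached the design rate")

print(round(map_threshold_by_area(3, 6), 5))
```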

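Lemma 8 (MAP Threshold for Large Degrees): Consider
the (l, r)-regular ensemble. Let $\mathtt{r}(l, r) = 1 - \frac{l}{r}$ denote
the design rate so that $r = \frac{l}{1 - \mathtt{r}}$. Then, for $\mathtt{r}$ fixed and
$l$ increasing, the MAP threshold $\epsilon^{\text{MAP}}(l, r)$ converges
exponentially fast (in $l$) to $1 - \mathtt{r}$.

Proof: Recall that the MAP threshold is determined
by the unique positive solution of the polynomial equation
$p^{\text{MAP}}(x) = 0$, where $p^{\text{MAP}}(x)$ is given in Lemma 4. A closer
look at this equation shows that this solution has the form

$$x = (1 - \mathtt{r})\Big(1 - \frac{\mathtt{r}^{\frac{l}{1-\mathtt{r}} - 1}(l + \mathtt{r} - 1)}{1 - \mathtt{r}^{\frac{l}{1-\mathtt{r}} - 2}(1 + l(l + \mathtt{r} - 2))} + o\big(l\,\mathtt{r}^{\frac{l}{1-\mathtt{r}} - 1}\big)\Big).$$

We see that the root converges exponentially fast (in $l$) to
$1 - \mathtt{r}$. Further, in terms of this root we can write the MAP
threshold as

$$\epsilon^{\text{MAP}}(l, r) = x\Big(1 + \frac{1 - \mathtt{r} - x}{(l + \mathtt{r} - 1)x}\Big)^{l-1}.$$ ■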

Lemma 9 (Stable and Unstable Fixed Points - [13]):
Consider the standard (l, r)-regular ensemble with $l \geq 3$.
Define

$$h(x) = \epsilon\,(1 - (1 - x)^{r-1})^{l-1} - x. \qquad (2)$$

Then, for $\epsilon^{\text{BP}}(l, r) < \epsilon \leq 1$, there are exactly two strictly
positive solutions of the equation $h(x) = 0$ and they are both
in the range [0, 1].

Let $x_s(\epsilon)$ be the larger of the two and let $x_u(\epsilon)$ be the
smaller of the two. Then $x_s(\epsilon)$ is a strictly increasing function
in $\epsilon$ and $x_u(\epsilon)$ is a strictly decreasing function in $\epsilon$. Finally,
$x_s(\epsilon^{\text{BP}}) = x_u(\epsilon^{\text{BP}})$.

Discussion: Recall that $h(x)$ represents the change of the
erasure probability of DE in one iteration, assuming that the
system has current erasure probability $x$. This change can be
negative (erasure probability decreases), it can be positive, or
it can be zero (i.e., there is a FP). We discuss some useful
properties of $h(x)$ in the appendix.

As the notation indicates, $x_s$ corresponds to a stable FP
whereas $x_u$ corresponds to an unstable FP. Here stability
means that if we initialize DE with the value $x_s(\epsilon) + \delta$ for
a sufficiently small $\delta$ then DE converges back to $x_s(\epsilon)$.
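
The two fixed points of Lemma 9 can be located numerically. The following Python sketch (our own; the grid size and tolerances are arbitrary choices) takes $h(x) = \epsilon(1 - (1-x)^{r-1})^{l-1} - x$, finds a grid point where $h$ is positive, and then bisects on either side of it to obtain the unstable FP $x_u(\epsilon)$ and the stable FP $x_s(\epsilon)$.

```python
def fixed_points(eps: float, l: int = 3, r: int = 6,
                 grid: int = 10_000, tol: float = 1e-13):
    """Return (x_u, x_s), the two strictly positive roots of h(x) = 0."""
    def h(x):
        return eps * (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1) - x

    # Grid point where h is largest; h is positive there iff eps exceeds the BP threshold.
    x_top = max((k / grid for k in range(1, grid)), key=h)
    if h(x_top) <= 0:
        raise ValueError("no strictly positive fixed point found; eps too small?")

    def root(neg: float, pos: float) -> float:
        """Bisection keeping h(neg) <= 0 < h(pos)."""
        while abs(pos - neg) > tol:
            mid = 0.5 * (neg + pos)
            if h(mid) > 0:
                pos = mid
            else:
                neg = mid
        return 0.5 * (neg + pos)

    x_u = root(1e-9, x_top)   # h < 0 just above 0, h > 0 at x_top
    x_s = root(1.0, x_top)    # h <= 0 at x = 1,   h > 0 at x_top
    return x_u, x_s

x_u, x_s = fixed_points(0.45)
print(x_u, x_s)
```

Evaluating the function at two nearby channel parameters illustrates the monotonicity claims: raising $\epsilon$ moves $x_s$ up and $x_u$ down.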

B. The (l, r, L) Ensemble

Consider the EBP EXIT curve of the (l, r, L) ensemble. To
compute this curve we proceed as follows. We fix a desired
"entropy" value, see Definition 15, call it $\chi$. We initialize DE
with the constant $\chi$. We then repeatedly perform one step
of DE, where in each step we fix the channel parameter in
such a way that the resulting entropy is equal to $\chi$. This is
equivalent to the procedure introduced in [24, Section VIII]
to compute the EBP EXIT curve for general binary-input
memoryless output-symmetric channels. Once the procedure
has converged, we plot its EXIT value versus the resulting
channel parameter. We then repeat the procedure for many
different entropy values to produce a whole curve.

Note that DE here is not just DE for the underlying 
ensemble. Due to the spatial structure we in effect deal with 
a multi-edge ensemble [25] with many edge types. For our 
current casual discussion the exact form of the DE equations 
is not important, but if you are curious please fast forward to
Section V.

Why do we use this particular procedure? By using forward 
DE, one can only reach stable FPs. But the above procedure 
allows one to find points along the whole EBP EXIT curve, 
i.e., one can in particular also produce unstable FPs of DE. 

The resulting curve is shown in Figure 5 for various values
of L. Note that these EBP EXIT curves show a dramatically
different behavior compared to the EBP EXIT curve of the
underlying ensemble. These curves appear to be "to the right"
of the threshold $\epsilon^{\text{MAP}}(3, 6) \approx 0.48815$. For small values of L
one might be led to believe that this is true since the design rate
of such an ensemble is considerably smaller than $1 - l/r$. But
even for large values of L, where the rate of the ensemble
is close to $1 - l/r$, this dramatic increase in the threshold
is still true. Empirically we see that, for L increasing, the
EBP EXIT curve approaches the MAP EXIT curve of the
underlying (l = 3, r = 6)-regular ensemble. In particular, for
$\epsilon \approx \epsilon^{\text{MAP}}(l, r)$ the EBP EXIT curve drops essentially vertically
until it hits zero. We will see that this is a fundamental property
of this construction.

Fig. 5. EBP EXIT curves of the ensemble (l = 3, r = 6, L)
for L = 1, 2, 4, 8, 16, 32, 64, and 128. The BP/MAP thresholds
are $\epsilon^{\text{BP/MAP}}(3, 6, 1) = 0.714309/0.820987$,
$\epsilon^{\text{BP/MAP}}(3, 6, 2) = 0.587842/0.668951$,
$\epsilon^{\text{BP/MAP}}(3, 6, 4) = 0.512034/0.574158$,
$\epsilon^{\text{BP/MAP}}(3, 6, 8) = 0.488757/0.527014$,
$\epsilon^{\text{BP/MAP}}(3, 6, 16) = 0.488151/0.505833$,
$\epsilon^{\text{BP/MAP}}(3, 6, 32) = 0.488151/0.496366$,
$\epsilon^{\text{BP/MAP}}(3, 6, 64) = 0.488151/0.492001$,
$\epsilon^{\text{BP/MAP}}(3, 6, 128) = 0.488151/0.489924$.
The light/dark gray areas mark the interior of
the BP/MAP EXIT function of the underlying (3, 6)-regular ensemble,
respectively.

C. Discussion 

A look at Figure 5 might convey the impression that the
transition of the EBP EXIT function is completely flat and
that the threshold of the ensemble (l, r, L) is exactly equal to
the MAP threshold of the underlying (l, r)-regular ensemble
when L tends to infinity.

Unfortunately, the actual behavior is more subtle. Figure 6
shows the EBP EXIT curve for L = 32 with a small section
of the transition greatly magnified. As one can see from this
magnification, the curve is not flat but exhibits small "wiggles"
in $\epsilon$ around $\epsilon^{\text{MAP}}(l, r)$. These wiggles do not vanish as L tends
to infinity but their width remains constant. As we will discuss
in much more detail later, area considerations imply that, in
the limit as L diverges to infinity, the BP threshold is slightly
below $\epsilon^{\text{MAP}}(l, r)$. Although this does not play a role in the
sequel, let us remark that the number of wiggles is (up to a
small additive constant) equal to L.

Where do these wiggles come from? They stem from the 
fact that the system is discrete. If, instead of considering 
a system with sections at integer points, we would deal 
with a continuous system where neighboring "sections" are 
infinitesimally close, then these wiggles would vanish. This 
"discretization" effect is well-known in the physics literature. 
By letting w tend to infinity we can in effect create a 
continuous system. This is in fact our main motivation for 
introducing this parameter. 

Empirically, these wiggles are very small (e.g., they are of
width 10^^ for the (l = 3, r = 6, L) ensemble), and further,
these wiggles tend to 0 when l is increased. Unfortunately
this is hard to prove.




Fig. 6. EBP EXIT curve for the (l = 3, r = 6, L = 32) ensemble. The
circle shows a magnified portion of the curve. The horizontal magnification
is 10^, the vertical one is 1.



We therefore study the ensemble (l, r, L, w). The wiggles
for this ensemble are in fact larger, see e.g. Figure 7. But, as
mentioned above, the wiggles can be made arbitrarily small
by letting w (the smoothing parameter) tend to infinity. E.g.,
in the left-hand side of Figure 7 we have w = 2, whereas in the
right-hand side we have w = 3. We see that the wiggle size has
decreased by more than a factor of 10"^.

Fig. 7. EBP EXIT curve for the (l = 3, r = 6, L = 16, w) ensemble. Left:
w = 2; the circle shows a magnified portion of the curve. The horizontal
magnification is 10'^, the vertical one is 1. Right: w = 3; the circle shows
a magnified portion of the curve. The horizontal magnification is 10^, the
vertical one is 1.

IV. Main Statement and Interpretation 

As pointed out in the introduction, numerical experiments 
indicate that there is a large class of convolutional-like LDPC 
ensembles that all have the property that their BP threshold 
is "close" to the MAP threshold of the underlying ensemble. 
Unfortunately, no general theorem is known to date that states 
when this is the case. The following theorem gives a particular 
instance of what we believe to be a general principle. The 
bounds stated in the theorem are loose and can likely be 
improved considerably. Throughout the paper we assume that
$l \ge 3$.



A. Main Statement 

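Theorem 10 (BP Threshold of the (l, r, L, w) Ensemble):
Consider transmission over the BEC($\epsilon$) using random
elements from the ensemble (l, r, L, w). Let $\epsilon^{\text{BP}}(l, r, L, w)$
denote the BP threshold and let R(l, r, L, w) denote the
design rate of this ensemble.

Then, in the limit as M tends to infinity, and for

$$w > \max\left\{2^{16},\; 2^{4} l^{2} r^{2},\; \frac{\left(2 l r \left(1 + \frac{2l}{1 - 2^{-1/(r-2)}}\right)\right)^{8}}{\left(1 - 2^{-1/(r-2)}\right)^{16} \left(\frac{1}{2}\left(1 - \frac{l}{r}\right)\right)^{8}}\right\},$$

$$\epsilon^{\text{BP}}(l, r, L, w) \le \epsilon^{\text{MAP}}(l, r, L, w) \le \epsilon^{\text{MAP}}(l, r) + \frac{w - 1}{2L\left(1 - (1 - x^{\text{MAP}}(l, r))^{r-1}\right)^{l}}, \qquad (3)$$

$$\epsilon^{\text{BP}}(l, r, L, w) \ge \left(\epsilon^{\text{MAP}}(l, r) - w^{-\frac{1}{8}}\, \frac{8 l r + \frac{4 r l^{2}}{(1 - 4 w^{-1/8})^{r}}}{\left(1 - 2^{-\frac{1}{r}}\right)^{2}}\right) \left(1 - 4 w^{-\frac{1}{8}}\right)^{r l}. \qquad (4)$$

In the limit as M, L and w (in that order) tend to infinity,

$$\lim_{w \to \infty} \lim_{L \to \infty} R(l, r, L, w) = 1 - \frac{l}{r}, \qquad (5)$$

$$\lim_{w \to \infty} \lim_{L \to \infty} \epsilon^{\text{BP}}(l, r, L, w) = \lim_{w \to \infty} \lim_{L \to \infty} \epsilon^{\text{MAP}}(l, r, L, w) = \epsilon^{\text{MAP}}(l, r). \qquad (6)$$

Discussion:

(i) The lower bound on $\epsilon^{\text{BP}}(l, r, L, w)$ is the main result
of this paper. It shows that, up to a term which tends
to zero when w tends to infinity, the threshold of the
chain is equal to the MAP threshold of the underlying
ensemble. The statement in the theorem is weak. As we
discussed earlier, the convergence speed w.r.t. w is most
likely exponential. We prove only a convergence speed
of $w^{-1/8}$. We pose it as an open problem to improve this
bound. We also remark that, as seen in (6), the MAP
threshold of the (l, r, L, w) ensemble tends to $\epsilon^{\text{MAP}}(l, r)$
for any finite w when L tends to infinity, whereas the BP
threshold is bounded away from $\epsilon^{\text{MAP}}(l, r)$ for any finite
w.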



(ii) We right away prove the upper bound on $\epsilon^{\text{BP}}(l, r, L, w)$.
For the purpose of our proof, we first consider a "circular"
ensemble. This ensemble is defined in an identical manner
as the (l, r, L, w) ensemble except that the positions
are now from 0 to K - 1 and index arithmetic is performed
modulo K. This circular ensemble has design rate equal
to $1 - l/r$. Set K = 2L + w. The original ensemble is
recovered by setting any consecutive w - 1 positions to
zero. We first provide a lower bound on the conditional
entropy for the circular ensemble when transmitting over
a BEC with parameter $\epsilon$. We then show that setting w - 1
sections to 0 does not significantly decrease this entropy.
Overall this gives an upper bound on the MAP threshold
of the original ensemble.

It is not hard to see that the BP EXIT curve (see the footnote
below) is the same for both the (l, r)-regular ensemble and the
circular ensemble. Indeed, the forward DE (see Definition 13)
converges to the same fixed-point for both ensembles.
Consider the (l, r)-regular ensemble and let
$\epsilon \in [\epsilon^{\text{MAP}}(l, r), 1]$. The conditional entropy when transmitting
over a BEC with parameter $\epsilon$ is at least equal to $1 - l/r$
minus the area under the BP EXIT curve between $[\epsilon, 1]$
(see Theorem 3.120 in [13]). Call this area $A(\epsilon)$. Here,
the entropy is normalized by KM, where K is the length
of the circular ensemble and M denotes the number of
variable nodes per section. Assume now that we set w - 1
consecutive sections of the circular ensemble to 0 in order
to recover the original ensemble. As a consequence, we
"remove" an entropy (degrees of freedom) of at most
(w - 1)/K from the circular system. The remaining
entropy is therefore positive (and hence we are above
the MAP threshold of the circular ensemble) as long as
$1 - l/r - (w - 1)/K - A(\epsilon) > 0$. Thus the MAP threshold
of the circular ensemble is given by the supremum over
all $\epsilon$ such that $1 - l/r - (w - 1)/K - A(\epsilon) \le 0$. Now
note that $A(\epsilon^{\text{MAP}}(l, r)) = 1 - l/r$, so that the above
condition becomes $A(\epsilon^{\text{MAP}}(l, r)) - A(\epsilon) \le (w - 1)/K$.
But the BP EXIT curve is an increasing function in
$\epsilon$ so that $A(\epsilon^{\text{MAP}}(l, r)) - A(\epsilon) > (\epsilon - \epsilon^{\text{MAP}}(l, r))(1 - (1 - x^{\text{MAP}}(l, r))^{r-1})^{l}$.
We get the stated upper bound on
$\epsilon^{\text{MAP}}(l, r, L, w)$ by lower bounding K by 2L.

Footnote: The BP EXIT curve is the plot of the extrinsic estimate of the BP decoder
versus the channel erasure fraction (see [13] for details).

(iii) According to Lemma 3,
$\lim_{L \to \infty} \lim_{M \to \infty} R(l, r, L, w) = 1 - \frac{l}{r}$. This
immediately implies the limit (5). The limit for
the BP threshold $\epsilon^{\text{BP}}(l, r, L, w)$ follows from (4).

(iv) According to Lemmajs] the MAP threshold e"'"'(l,r) of 
the underlying ensemble quickly approaches the Shannon 
limit. We therefore see that convolutional-like ensembles 
provide a way of approaching capacity with low complexity.
E.g., for a rate equal to one-half, we get
$\epsilon^{\text{MAP}}(l = 3, r = 6) = 0.48815$, $\epsilon^{\text{MAP}}(l = 4, r = 8) = 0.49774$,
$\epsilon^{\text{MAP}}(l = 5, r = 10) = 0.499486$, $\epsilon^{\text{MAP}}(l = 6, r = 12) = 0.499876$,
$\epsilon^{\text{MAP}}(l = 7, r = 14) = 0.499969$.
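
As a numerical illustration of the values quoted in (iv) and of the upper bound (3), the following Python sketch may be helpful. It is not part of the original argument. It assumes the standard characterization of the MAP threshold of the (l, r)-regular ensemble on the BEC, which is not stated in this section: $x^{\text{MAP}}$ is taken to be the unique positive root of $x + \frac{1}{r}(1-x)^{r-1}(l + l(r-1)x - rx) - \frac{l}{r}$, and $\epsilon^{\text{MAP}} = x^{\text{MAP}}/(1-(1-x^{\text{MAP}})^{r-1})^{l-1}$. The function names `map_threshold` and `map_upper_bound` are hypothetical.

```python
def map_threshold(l, r, tol=1e-14):
    """Return (x_map, eps_map) for the (l, r)-regular ensemble on the BEC.

    Assumption (not taken from this section): x_map is the unique positive
    root of P(x) = x + (1/r)(1-x)^(r-1) (l + l(r-1)x - r x) - l/r.
    """
    def trial_entropy(x):
        return x + (1.0 - x) ** (r - 1) * (l + l * (r - 1) * x - r * x) / r - l / r

    # P is negative just above 0 and positive at 1 for the degrees used here,
    # so plain bisection on [0.01, 1] locates the positive root.
    lo, hi = 1e-2, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if trial_entropy(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x_map = 0.5 * (lo + hi)
    eps_map = x_map / (1.0 - (1.0 - x_map) ** (r - 1)) ** (l - 1)
    return x_map, eps_map


def map_upper_bound(l, r, L, w):
    """Right-hand side of (3): eps_map + (w-1) / (2L (1-(1-x_map)^(r-1))^l)."""
    x_map, eps_map = map_threshold(l, r)
    return eps_map + (w - 1) / (2.0 * L * (1.0 - (1.0 - x_map) ** (r - 1)) ** l)
```

For instance, `map_threshold(3, 6)` can be compared with the value 0.48815 quoted above, and `map_upper_bound(3, 6, L, w)` shows how the slack in (3) shrinks like (w - 1)/L.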

B. Proof Outline 



The proof of the lower bound in Theorem 10 is long. 
We therefore break it up into several steps. Let us start by 
discussing each of the steps separately. This hopefully clarifies 
the main ideas. But it will also be useful later when we 
discuss how the main statement can potentially be generalized. 
We will see that some steps are quite generic, whereas other 
steps require a rather detailed analysis of the particular chosen 
system. 

(i) Existence of FP: "The" key to the proof is to show the
existence of a unimodal FP $(\epsilon^*, x^*)$ which takes on an
essentially constant value in the "middle", has a fast
"transition", and has arbitrarily small values towards the
boundary (see Definition 12). Figure 8 shows a typical
such example. We will see later that the associated
channel parameter of such a FP, $\epsilon^*$, is necessarily very
close to $\epsilon^{\text{MAP}}(l, r)$.
(ii) Construction of EXIT Curve: Once we have established
the existence of such a special FP we construct from
it a whole FP family. The elements in this family of
FPs look essentially identical. They differ only in their
"width." This width changes continuously, initially being
equal to roughly 2L + 1 until it reaches zero. As we will
see, this family "explains" how the overall constellation
(see Definition 12) collapses once the channel parameter
has reached a value close to $\epsilon^{\text{MAP}}(l, r)$: starting from the
two boundaries, the whole constellation "moves in" like
a wave until the two wave ends meet in the middle.
The EBP EXIT curve is a projection of this wave (by
computing the EXIT value of each member of the family).
If we look at the EBP EXIT curve, this phenomenon
corresponds to the very steep vertical transition close to
$\epsilon^{\text{MAP}}(l, r)$.

Fig. 8. Unimodal FP of the (l = 3, r = 6, L = 16, w = 3) ensemble with
small values towards the boundary, a fast transition, and essentially constant
values in the middle.

Where do the wiggles in the EBP EXIT curve come 
from? Although the various FPs look "almost" identical 
(other than the place of the transition) they are not exactly 
identical. The e value changes very slightly (around e*). 
The larger we choose w the smaller we can make the 
changes (at the cost of a longer transition). 
When we construct the above family of FPs it is math- 
ematically convenient to allow the channel parameter e 
to depend on the position. Let us describe this in more 
detail. 

We start with a special FP as depicted in Figure 8.
From this we construct a smooth family $(\epsilon(\alpha), x(\alpha))$,
parameterized by $\alpha$, $\alpha \in [0, 1]$, where $x(1) = 1$ and
where $x(0) = 0$. The components of the vector $\epsilon(\alpha)$ are
essentially constants (for $\alpha$ fixed). The possible exceptions
are components towards the boundary. We allow
those components to take on larger (than in the middle)
values.

From the family $(\epsilon(\alpha), x(\alpha))$ we derive an EBP EXIT
curve and we then measure the area enclosed by this
curve. We will see that this area is close to the design rate.
From this we will be able to conclude that $\epsilon^* \approx \epsilon^{\text{MAP}}(l, r)$.

(iii) Operational Meaning of EXIT Curve: We next show 
that the EBP EXIT curve constructed in step (ii) has 
an operational meaning. More precisely, we show that 
if we pick a channel parameter sufficiently below e* then 
forward DE converges to the trivial FP. 

(iv) Putting it all Together: The final step is to combine all
the constructions and bounds discussed in the previous
steps to show that $\epsilon^{\text{BP}}(l, r, L, w)$ converges to $\epsilon^{\text{MAP}}(l, r)$
when w and L tend to infinity.



V. Proof of Theorem 10

This section contains the technical details of Theorem 10.
We accomplish the proof by following the steps outlined in
the previous section. To enhance the readability of this section
we have moved some of the long proofs to the appendices.

A. Step (i): Existence of FP

Definition 11 (Density Evolution of (l, r, L, w) Ensemble):
Let $x_i$, $i \in \mathbb{Z}$, denote the average erasure probability which
is emitted by variable nodes at position i. For $i \notin [-L, L]$ we
set $x_i = 0$. For $i \in [-L, L]$ the FP condition implied by DE
is

$$x_i = \epsilon \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right)^{r-1}\right)^{l-1}. \qquad (7)$$

If we define

$$f_i = \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i-k}\right)^{r-1}, \qquad (8)$$

then (7) can be rewritten as

$$x_i = \epsilon \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j}\right)^{l-1}.$$

In the sequel it will be handy to have an even shorter form
for the right-hand side of (7). Therefore, let

$$g(x_{i-w+1}, \dots, x_{i+w-1}) = \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j}\right)^{l-1}. \qquad (9)$$

Note that

$$g(x, \dots, x) = \left(1 - (1 - x)^{r-1}\right)^{l-1},$$

where the right-hand side represents DE for the underlying
(l, r)-regular ensemble.

The function $f_i(x_{i-w+1}, \dots, x_i)$ defined in (8) is decreasing
in all its arguments $x_j \in [0, 1]$, $j = i - w + 1, \dots, i$. In the
sequel, it is understood that $x_i \in [0, 1]$. The channel parameter
$\epsilon$ is allowed to take values in $\mathbb{R}^+$.

Definition 12 (FPs of Density Evolution): Consider DE for
the (l, r, L, w) ensemble. Let $x = (x_{-L}, \dots, x_L)$. We call
x the constellation. We say that x forms a FP of DE with
parameter $\epsilon$ if x fulfills (7) for $i \in [-L, L]$. As a short hand
we then say that $(\epsilon, x)$ is a FP. We say that $(\epsilon, x)$ is a non-trivial
FP if x is not identically zero. More generally, let

$$\epsilon = (\epsilon_{-L}, \dots, \epsilon_0, \dots, \epsilon_L),$$

where $\epsilon_i \in \mathbb{R}^+$ for $i \in [-L, L]$. We say that $(\epsilon, x)$ forms a FP
if

$$x_i = \epsilon_i\, g(x_{i-w+1}, \dots, x_{i+w-1}), \quad i \in [-L, L]. \qquad (10)$$

Definition 13 (Forward DE and Admissible Schedules):
Consider DE for the (l, r, L, w) ensemble. More precisely,
pick a parameter $\epsilon \in [0, 1]$. Initialize $x^{(0)} = (1, \dots, 1)$. Let
$x^{(\ell)}$ be the result of $\ell$ rounds of DE. I.e., $x^{(\ell+1)}$ is generated
from $x^{(\ell)}$ by applying the DE equation (7) to each section
$i \in [-L, L]$,

$$x_i^{(\ell+1)} = \epsilon\, g(x_{i-w+1}^{(\ell)}, \dots, x_{i+w-1}^{(\ell)}).$$

We call this the parallel schedule.

More generally, consider a schedule in which in each step
$\ell$ an arbitrary subset of the sections is updated, constrained
only by the fact that every section is updated in infinitely
many steps. We call such a schedule admissible. Again, we
call $x^{(\ell)}$ the resulting sequence of constellations.

In the sequel we will refer to this procedure as forward
DE by which we mean the appropriate initialization and the
subsequent DE procedure. E.g., in the next lemma we will
discuss the FPs which are reached under forward DE. These
FPs have special properties and so it will be convenient to
be able to refer to them in a succinct way and to be able to
distinguish them from general FPs of DE.

Lemma 14 (FPs of Forward DE): Consider forward DE
for the (l, r, L, w) ensemble. Let $x^{(\ell)}$ denote the sequence
of constellations under an admissible schedule. Then $x^{(\ell)}$
converges to a FP of DE and this FP is independent of the
schedule. In particular, it is equal to the FP of the parallel
schedule.

Proof: Consider first the parallel schedule. We claim that
the vectors $x^{(\ell)}$ are ordered, i.e., $x^{(0)} \ge x^{(1)} \ge \dots \ge 0$ (the
ordering is pointwise). This is true since $x^{(0)} = (1, \dots, 1)$,
whereas $x^{(1)} \le (\epsilon, \dots, \epsilon) \le (1, \dots, 1) = x^{(0)}$. It now follows
by induction on the number of iterations that the sequence $x^{(\ell)}$
is monotonically decreasing.

Since the sequence $x^{(\ell)}$ is also bounded from below it
converges. Call the limit $x^{(\infty)}$. Since the DE equations are
continuous it follows that $x^{(\infty)}$ is a fixed point of DE (7)
with parameter $\epsilon$. We call $x^{(\infty)}$ the forward FP of DE.

That the limit (exists in general and that it) does not depend
on the schedule follows by standard arguments and we will
be brief. The idea is that for any two admissible schedules the
corresponding computation trees are nested. This means that
if we look at the computation graph of, say, schedule 1
at time $\ell$ then there exists a time $\ell'$ so that the computation
graph under schedule 2 is a superset of the first computation
graph. To be able to come to this conclusion we have crucially
used the fact that for an admissible schedule every section
is updated infinitely often. This shows that the performance
under schedule 2 is at least as good as the performance under
schedule 1. The converse claim, and hence equality, follows
by symmetry. ■
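
To make the forward DE procedure of Definition 13 concrete, here is a minimal Python sketch of the parallel schedule for the FP equation (7) with the zero boundary condition of Definition 11. It is an illustration only: the function names `de_step` and `forward_de`, the stopping tolerance and the iteration cap are our own choices, not part of the paper.

```python
def de_step(x, eps, l, r, w):
    """One parallel-schedule DE update; x[idx] holds x_i with i = idx - L."""
    n = len(x)

    def val(idx):
        # x_i = 0 outside [-L, L]
        return x[idx] if 0 <= idx < n else 0.0

    def f(idx):
        # f_i = (1 - (1/w) sum_{k<w} x_{i-k})^(r-1), cf. (8)
        return (1.0 - sum(val(idx - k) for k in range(w)) / w) ** (r - 1)

    # x_i <- eps * (1 - (1/w) sum_{j<w} f_{i+j})^(l-1), cf. (7);
    # the max() only guards against tiny negative rounding errors.
    return [eps * max(0.0, 1.0 - sum(f(idx + j) for j in range(w)) / w) ** (l - 1)
            for idx in range(n)]


def forward_de(eps, l, r, L, w, max_iter=20000, tol=1e-13):
    """Run forward DE from the all-one constellation.

    Returns (x, monotone): the final constellation and a flag telling whether
    every update was point-wise non-increasing (up to rounding), as claimed
    in the proof of Lemma 14.
    """
    x = [1.0] * (2 * L + 1)
    monotone = True
    for _ in range(max_iter):
        x_new = de_step(x, eps, l, r, w)
        monotone = monotone and all(b <= a + 1e-12 for a, b in zip(x, x_new))
        change = max(abs(a - b) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            break
    return x, monotone
```

Calling `forward_de(eps, 3, 6, 16, 3)` for a few values of `eps` shows the two regimes discussed in the text: for small `eps` the constellation collapses to the trivial FP, while for large `eps` it settles at a non-trivial, symmetric FP.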
Definition 15 (Entropy): Let x be a constellation. We define
the (normalized) entropy of x to be

$$\chi(x) = \frac{1}{2L + 1} \sum_{i=-L}^{L} x_i.$$

Discussion: More precisely, we should call $\chi(x)$ the average
message entropy. But we will stick with the shorthand entropy
in the sequel.

Lemma 16 (Nontrivial FPs of Forward DE): Consider the
ensemble (l, r, L, w). Let x be the FP of forward DE for
the parameter $\epsilon$. For $\epsilon \in (\frac{l}{r}, 1]$ and $\chi \in [0, \epsilon^{\frac{1}{l-1}}(\epsilon - \frac{l}{r}))$, if

$$L \ge \frac{w}{2\left(\frac{r}{l}\left(\epsilon - \chi \epsilon^{-\frac{1}{l-1}}\right) - 1\right)} \qquad (11)$$

then $\chi(x) \ge \chi$.

Proof: Let R(l, r, L, w) be the design rate of the
(l, r, L, w) ensemble as stated in Lemma 3. Note that the
design rate is a lower bound on the actual rate. It follows that
the system has at least (2L + 1) R(l, r, L, w) M degrees of
freedom. If we transmit over a channel with parameter $\epsilon$ then
in expectation at most $(2L + 1)(1 - \epsilon)M$ of these degrees of
freedom are resolved. Recall that we are considering the limit
in which M diverges to infinity. Therefore we can work with
averages and do not need to worry about the variation of the
quantities under consideration. It follows that the number of
degrees of freedom left unresolved, measured per position and
normalized by M, is at least $(R(l, r, L, w) - 1 + \epsilon)$.

Let x be the forward DE FP corresponding to parameter
$\epsilon$. Recall that $x_i$ is the average message which flows from a
variable at position i towards the check nodes. From this we
can compute the corresponding probability that the node value
at position i has not been recovered. It is equal to
$\epsilon \left(\frac{x_i}{\epsilon}\right)^{\frac{l}{l-1}} = \epsilon^{-\frac{1}{l-1}} x_i^{\frac{l}{l-1}}$.
Clearly, the BP decoder cannot be better than the
MAP decoder. Further, the MAP decoder cannot resolve the
unknown degrees of freedom. It follows that we must have

$$\epsilon^{-\frac{1}{l-1}} \frac{1}{2L + 1} \sum_{i=-L}^{L} x_i^{\frac{l}{l-1}} \ge R(l, r, L, w) - 1 + \epsilon.$$

Note that $x_i \in [0, 1]$ so that $x_i \ge x_i^{\frac{l}{l-1}}$. We conclude that

$$\chi(x) = \frac{1}{2L + 1} \sum_{i=-L}^{L} x_i \ge \epsilon^{\frac{1}{l-1}} \left(R(l, r, L, w) - 1 + \epsilon\right).$$

Assume that we want a constellation with entropy at least $\chi$.
Using the expression for R(l, r, L, w) from Lemma 3 this
leads to the inequality

$$-\frac{l}{r} - \frac{l}{r}\, \frac{w + 1 - 2\sum_{i=0}^{w} \left(\frac{i}{w}\right)^{r}}{2L + 1} + \epsilon \ge \chi \epsilon^{-\frac{1}{l-1}}. \qquad (12)$$

Solving for L and simplifying the inequality by upper bounding
$1 - 2\sum_{i=0}^{w} \left(\frac{i}{w}\right)^{r}$ by 0 and lower bounding 2L + 1 by 2L
leads to (11). ■
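
The length condition (11) is easy to evaluate numerically. The sketch below, which is an illustration under our own naming (`min_length`, `forward_de_entropy`) and with illustrative parameters, computes the right-hand side of (11) and then checks, by running forward DE with the parallel schedule on equation (7), that the resulting FP indeed has entropy at least $\chi$.

```python
import math


def min_length(l, r, w, eps, chi):
    """Right-hand side of (11); valid for l/r < eps <= 1 and
    0 <= chi < eps^(1/(l-1)) * (eps - l/r)."""
    if not (l / r < eps <= 1.0):
        raise ValueError("need l/r < eps <= 1")
    if not (0.0 <= chi < eps ** (1.0 / (l - 1)) * (eps - l / r)):
        raise ValueError("chi outside the range allowed by Lemma 16")
    return w / (2.0 * ((r / l) * (eps - chi * eps ** (-1.0 / (l - 1))) - 1.0))


def forward_de_entropy(eps, l, r, L, w, max_iter=20000, tol=1e-13):
    """Entropy (1/(2L+1)) sum_i x_i of the forward DE FP of equation (7)."""
    n = 2 * L + 1
    x = [1.0] * n
    for _ in range(max_iter):
        def val(idx):
            return x[idx] if 0 <= idx < n else 0.0  # zero outside [-L, L]

        def f(idx):
            return (1.0 - sum(val(idx - k) for k in range(w)) / w) ** (r - 1)

        x_new = [eps * max(0.0, 1.0 - sum(f(i + j) for j in range(w)) / w) ** (l - 1)
                 for i in range(n)]
        change = max(abs(a - b) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            break
    return sum(x) / n
```

For example, with l = 3, r = 6, w = 3, $\epsilon = 0.6$ and $\chi = 0.05$ one can take `L = math.ceil(min_length(3, 6, 3, 0.6, 0.05))` and compare `forward_de_entropy(0.6, 3, 6, L, 3)` with $\chi$. The bound is far from tight in this example, which is consistent with the coarse estimates used in the proof.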
Not all FPs can be constructed by forward DE. In particular,
one can only reach (marginally) "stable" FPs by the above
procedure. Recall from Section IV-B, step (i), that we want to
construct a unimodal FP which "explains" how the constellation
collapses. Such a FP is by its very nature unstable.

It is difficult to prove the existence of such a FP by direct
methods. We therefore proceed in stages. We first show the
existence of a "one-sided" increasing FP. We then construct
the desired unimodal FP by taking two copies of the one-sided
FP, flipping one copy, and gluing these FPs together.

Definition 17 (One-Sided Density Evolution): Consider the
tuple $x = (x_{-L}, \dots, x_0)$. The FP condition implied by one-sided
DE is equal to (7) with $x_i = 0$ for $i < -L$ and $x_i = x_0$
for $i > 0$.

Definition 18 (FPs of One-Sided DE): We say that x is a
one-sided FP (of DE) with parameter $\epsilon$ and length L if (7) is
fulfilled for $i \in [-L, 0]$, with $x_i = 0$ for $i < -L$ and $x_i = x_0$
for $i > 0$.

In the same manner as we have done this for two-sided
FPs, if $\epsilon = (\epsilon_{-L}, \dots, \epsilon_0)$ then we define one-sided FPs with
respect to $\epsilon$.

We say that x is non-decreasing if $x_i \le x_{i+1}$ for
$i = -L, \dots, 0$.



Definition 19 (Entropy): Let x be a one-sided FP. We
define the (normalized) entropy of x to be

$$\chi(x) = \frac{1}{L + 1} \sum_{i=-L}^{0} x_i.$$

Definition 20 (Proper One-Sided FPs): Let (e, x) be a non- 
trivial and non-decreasing one-sided FP. As a short hand, we 
then say that (e, x) is a proper one-sided FP. 
A proper one-sided FP is shown in Figure 9.

Definition 21 (One-Sided Forward DE and Schedules):
Similar to Definition 13, one can define the one-sided forward
DE by initializing all sections with 1 and by applying DE
according to an admissible schedule.

Lemma 22 (FPs of One-Sided Forward DE): Consider an
(l, r, L, w) ensemble and let $\epsilon \in [0, 1]$. Let $x^{(0)} = (1, \dots, 1)$
and let $x^{(\ell)}$ denote the result of applying $\ell$ steps of one-sided
forward DE according to an admissible schedule (cf.
Definition 21). Then

(i) $x^{(\ell)}$ converges to a limit which is a FP of one-sided DE.
This limit is independent of the schedule and the limit is
either proper or trivial. As a short hand we say that $(\epsilon, x)$
is a one-sided FP of forward DE.

(ii) For $\epsilon \in (\frac{l}{r}, 1]$ and $\chi \in [0, \epsilon^{\frac{1}{l-1}}(\epsilon - \frac{l}{r}))$, if L fulfills (11)
then $\chi(x) \ge \chi$.

Proof: The existence of the FP and the independence
of the schedule follow along the same lines as the equivalent
statement for two-sided FPs in Lemma 14. We hence skip the
details. Assume that this limit $x^{(\infty)}$ is non-trivial. We want
to show that it is proper. This means we want to show that it
is non-decreasing. We use induction. The initial constellation
is non-decreasing. Let us now show that this property stays
preserved in each step of DE if we apply a parallel schedule.
More precisely, for any section $i \in [-L, 0]$,

$$x_i^{(\ell+1)} = \epsilon\, g(x_{i-w+1}^{(\ell)}, \dots, x_{i+w-1}^{(\ell)}) \overset{(a)}{\le} \epsilon\, g(x_{i+1-w+1}^{(\ell)}, \dots, x_{i+1+w-1}^{(\ell)}) = x_{i+1}^{(\ell+1)},$$

where (a) follows from the monotonicity of $g(\dots)$ and the
induction hypothesis that $x^{(\ell)}$ is non-decreasing.

Let us now show that for $\epsilon \in (\frac{l}{r}, 1]$ and $\chi \in [0, \epsilon^{\frac{1}{l-1}}(\epsilon - \frac{l}{r}))$,
if L fulfills (11) then $\chi(x) \ge \chi$. First, recall from Lemma 16
that the corresponding two-sided FP of forward DE has
entropy at least $\chi$ under the stated conditions. Now compare
one-sided and two-sided DE for the same initialization with the
constant value 1 and the parallel schedule. We claim that for
any step the values of the one-sided constellation at position
i, $i \in [-L, 0]$, are larger than or equal to the values of the
two-sided constellation at the same position i. To see this we
use induction. The claim is trivially true for the initialization.
Assume therefore that the claim is true at a particular iteration
$\ell$. For all points $i \in [-L, -w + 1]$ it is then trivially also true
in iteration $\ell + 1$, using the monotonicity of the DE map. For
points $i \in [-w + 2, 0]$, recall that the one-sided DE "sees"
the value $x_0$ for all positions $x_i$, $i \ge 0$, and that $x_0$ is the
largest of all x-values. For the two-sided DE on the other
hand, by symmetry, $x_i = x_{-i} \le x_0$ for all $i \ge 0$. Again by
monotonicity, we see that the desired conclusion holds.

To conclude the proof: note that if for a unimodal two-sided
constellation we compute the average over the positions
$[-L, 0]$ then we get at least as large a number as if we compute
it over the whole length $[-L, L]$. This follows since the value
at position 0 is maximal. ■

Fig. 9. A proper one-sided FP $(\epsilon, x)$ for the ensemble (l = 3, r = 6, L =
16, w = 3), where $\epsilon = 0.488151$. As we will discuss in Lemma 23, for
sufficiently large L, the maximum value of x, namely $x_0$, approaches the
stable value $x_s(\epsilon)$. Further, as discussed in Lemma 26, the width of the
transition is of order $O(w/\delta)$, where $\delta > 0$ is a parameter that indicates which
elements of the constellation we want to include in the transition.



Let us establish some basic properties of proper one-sided 
FPs. 

Lemma 23 (Maximum of FP): Let $(\epsilon, x)$, $0 \le \epsilon \le 1$, be a
proper one-sided FP of length L. Then $\epsilon > \epsilon^{\text{BP}}(l, r)$ and

$$x_u(\epsilon) \le x_0 \le x_s(\epsilon),$$

where $x_s(\epsilon)$ and $x_u(\epsilon)$ denote the stable and unstable non-zero
FP associated to $\epsilon$, respectively.

Proof: We start by proving that $\epsilon \ge \epsilon^{\text{BP}}(l, r)$. Assume
to the contrary that $\epsilon < \epsilon^{\text{BP}}(l, r)$. Then

$$x_0 = \epsilon\, g(x_{-w+1}, \dots, x_{w-1}) \le \epsilon\, g(x_0, \dots, x_0) < x_0,$$

a contradiction. Here, the last step follows since $\epsilon < \epsilon^{\text{BP}}(l, r)$
and $0 < x_0 \le 1$.

Let us now consider the claim that $x_u(\epsilon) \le x_0 \le x_s(\epsilon)$.
The proof follows along a similar line of arguments. Since
$\epsilon^{\text{BP}}(l, r) \le \epsilon \le 1$, both $x_s(\epsilon)$ and $x_u(\epsilon)$ exist and are strictly
positive. Suppose that $x_0 > x_s(\epsilon)$ or that $x_0 < x_u(\epsilon)$. Then

$$x_0 = \epsilon\, g(x_{-w+1}, \dots, x_{w-1}) \le \epsilon\, g(x_0, \dots, x_0) < x_0,$$

a contradiction.

A slightly more careful analysis shows that $\epsilon \ne \epsilon^{\text{BP}}$, so that
in fact we have strict inequality, namely $\epsilon > \epsilon^{\text{BP}}(l, r)$. We skip
the details. ■
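
The statements of Lemma 22 and Lemma 23 can be observed numerically. The following Python sketch runs one-sided forward DE (Definitions 17 and 21, parallel schedule) and computes the two non-zero FPs $x_u(\epsilon)$ and $x_s(\epsilon)$ of the scalar recursion $x = \epsilon(1-(1-x)^{r-1})^{l-1}$. The function names, the grid search used to bracket $x_u$, and the tolerances are our own illustrative choices; `scalar_fps` assumes $\epsilon > \epsilon^{\text{BP}}(l, r)$ so that both FPs exist.

```python
def one_sided_fp(eps, l, r, L, w, max_iter=20000, tol=1e-13):
    """One-sided forward DE; x[idx] holds x_i with i = idx - L, i in [-L, 0]."""
    x = [1.0] * (L + 1)

    def val(idx):
        # x_i = 0 for i < -L and x_i = x_0 for i > 0
        return 0.0 if idx < 0 else x[min(idx, L)]

    def f(idx):
        return (1.0 - sum(val(idx - k) for k in range(w)) / w) ** (r - 1)

    for _ in range(max_iter):
        x_new = [eps * max(0.0, 1.0 - sum(f(i + j) for j in range(w)) / w) ** (l - 1)
                 for i in range(L + 1)]
        change = max(abs(a - b) for a, b in zip(x, x_new))
        x = x_new  # val() and f() see the updated constellation from now on
        if change < tol:
            break
    return x


def scalar_fps(eps, l, r, grid=10000):
    """Return (x_u, x_s) for x = eps * (1 - (1-x)^(r-1))^(l-1), eps > eps_BP."""
    def h(x):
        return eps * (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1) - x

    # Stable FP: iterate the scalar recursion starting from x = 1.
    x_s = 1.0
    for _ in range(100000):
        nxt = x_s + h(x_s)
        if abs(nxt - x_s) < 1e-15:
            break
        x_s = nxt
    # Unstable FP: first sign change of h on a grid, refined by bisection.
    lo = next(k / grid for k in range(1, grid) if h((k + 1) / grid) > 0.0)
    hi = lo + 1.0 / grid
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi), x_s
```

With, e.g., `x = one_sided_fp(0.6, 3, 6, 16, 3)` the constellation is non-decreasing and its last entry $x_0$ lies between the two values returned by `scalar_fps(0.6, 3, 6)`, very close to $x_s(\epsilon)$.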

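Lemma 24 (Basic Bounds on FP): Let $(\epsilon, x)$ be a proper
one-sided FP of length L. Then for all $i \in [-L, 0]$,

(i) $x_i \le \epsilon \left(1 - \left(1 - \frac{1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}\right)^{r-1}\right)^{l-1}$,

(ii) $x_i \le \epsilon \left(\frac{r-1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}\right)^{l-1}$,

(iii) $x_i \ge \epsilon \left(\frac{1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}\right)^{l-1}$,

(iv) $x_i \ge \epsilon \left(\left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+w-1-k}\right)^{r-2} \frac{r-1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}\right)^{l-1}$.

Proof: We have

$$x_i = \epsilon \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right)^{r-1}\right)^{l-1}.$$

Let $\mathsf{f}(x) = (1 - x)^{r-1}$, $x \in [0, 1]$. Since $\mathsf{f}''(x) = (r-1)(r-2)(1 - x)^{r-3} \ge 0$,
$\mathsf{f}(x)$ is convex. Let $y_j = \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}$.
We have

$$\frac{1}{w} \sum_{j=0}^{w-1} \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right)^{r-1} = \frac{1}{w} \sum_{j=0}^{w-1} \mathsf{f}(y_j).$$

Since $\mathsf{f}(x)$ is convex, using Jensen's inequality, we obtain

$$\frac{1}{w} \sum_{j=0}^{w-1} \mathsf{f}(y_j) \ge \mathsf{f}\left(\frac{1}{w} \sum_{j=0}^{w-1} y_j\right),$$

which proves claim (i).

The derivation of the remaining inequalities is based on the
following identity:

$$1 - B^{r-1} = (1 - B)(1 + B + \dots + B^{r-2}). \qquad (13)$$

For $0 \le B \le 1$ this gives rise to the following inequalities:

$$1 - B^{r-1} \ge (r-1) B^{r-2} (1 - B), \qquad (14)$$
$$1 - B^{r-1} \ge (1 - B), \qquad (15)$$
$$1 - B^{r-1} \le (r-1)(1 - B). \qquad (16)$$

Let $B_j = 1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}$, so that $1 - f_{i+j} = 1 - B_j^{r-1}$
(recall the definition of $f_{i+j}$ from (8)). Using (15) this proves
(iii):

$$x_i = \epsilon \left(\frac{1}{w} \sum_{j=0}^{w-1} (1 - f_{i+j})\right)^{l-1} \ge \epsilon \left(\frac{1}{w} \sum_{j=0}^{w-1} \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right)^{l-1}.$$

If we use (16) instead then we get (ii). To prove (iv) we use
(14):

$$\left(\frac{x_i}{\epsilon}\right)^{\frac{1}{l-1}} = \frac{1}{w} \sum_{j=0}^{w-1} (1 - f_{i+j}) \ge \frac{r-1}{w} \sum_{j=0}^{w-1} \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right)^{r-2} \left(\frac{1}{w} \sum_{k=0}^{w-1} x_{i+j-k}\right).$$

Since x is increasing, $\sum_{k=0}^{w-1} x_{i+j-k} \le \sum_{k=0}^{w-1} x_{i+w-1-k}$.
Hence,

$$x_i \ge \epsilon \left(\left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+w-1-k}\right)^{r-2} \frac{r-1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}\right)^{l-1}. \qquad ■$$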
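
The four bounds of Lemma 24 can be checked numerically on a proper one-sided FP. The sketch below is illustrative only: it obtains such a FP by one-sided forward DE (our choice of construction, with hypothetical function names and illustrative parameters) and reports the smallest slack over all positions and all four inequalities, which should be non-negative up to rounding.

```python
def one_sided_fp(eps, l, r, L, w, max_iter=20000, tol=1e-13):
    """One-sided forward DE; x[idx] holds x_i with i = idx - L, i in [-L, 0]."""
    x = [1.0] * (L + 1)

    def val(idx):
        return 0.0 if idx < 0 else x[min(idx, L)]  # 0 on the left, x_0 on the right

    def f(idx):
        return (1.0 - sum(val(idx - k) for k in range(w)) / w) ** (r - 1)

    for _ in range(max_iter):
        x_new = [eps * max(0.0, 1.0 - sum(f(i + j) for j in range(w)) / w) ** (l - 1)
                 for i in range(L + 1)]
        change = max(abs(a - b) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            break
    return x


def lemma24_min_slack(x, eps, l, r, w):
    """Smallest slack over bounds (i)-(iv) of Lemma 24 and all positions."""
    L = len(x) - 1

    def val(idx):
        return 0.0 if idx < 0 else x[min(idx, L)]

    worst = float("inf")
    for idx in range(L + 1):
        # s = (1/w^2) sum_{j,k} x_{i+j-k},  t = (1/w) sum_k x_{i+w-1-k}
        s = sum(val(idx + j - k) for j in range(w) for k in range(w)) / w ** 2
        t = sum(val(idx + w - 1 - k) for k in range(w)) / w
        upper_i = eps * (1.0 - (1.0 - s) ** (r - 1)) ** (l - 1)
        upper_ii = eps * ((r - 1) * s) ** (l - 1)
        lower_iii = eps * s ** (l - 1)
        lower_iv = eps * ((1.0 - t) ** (r - 2) * (r - 1) * s) ** (l - 1)
        worst = min(worst, upper_i - x[idx], upper_ii - x[idx],
                    x[idx] - lower_iii, x[idx] - lower_iv)
    return worst
```

For instance, `lemma24_min_slack(one_sided_fp(0.6, 3, 6, 16, 3), 0.6, 3, 6, 3)` evaluates all four bounds for the (l = 3, r = 6, L = 16, w = 3) ensemble.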



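Lemma 25 (Spacing of FP): Let $(\epsilon, x)$, $\epsilon \ge 0$, be a proper
one-sided FP of length L. Then for $i \in [-L + 1, 0]$,

$$x_i - x_{i-1} \le \epsilon\, \frac{(l-1)(r-1)\left(\frac{x_i}{\epsilon}\right)^{\frac{l-2}{l-1}}}{w^2} \sum_{k=0}^{w-1} x_{i+k} \le \epsilon\, \frac{(l-1)(r-1)\left(\frac{x_i}{\epsilon}\right)^{\frac{l-2}{l-1}}}{w}.$$

Let $\bar{x}_i$ denote the weighted average $\bar{x}_i = \frac{1}{w^2} \sum_{j,k=0}^{w-1} x_{i+j-k}$.
Then, for any $i \in [-\infty, 0]$,

$$\bar{x}_i - \bar{x}_{i-1} \le \frac{1}{w^2} \sum_{k=0}^{w-1} x_{i+k} \le \frac{1}{w}.$$

Proof: Represent both $x_i$ as well as $x_{i-1}$ in terms of the
DE equation (10). Taking the difference,

$$\frac{x_i - x_{i-1}}{\epsilon} = \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j}\right)^{l-1} - \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j-1}\right)^{l-1}. \qquad (17)$$

Apply the identity

$$A^m - B^m = (A - B)(A^{m-1} + A^{m-2} B + \dots + B^{m-1}), \qquad (18)$$

where we set $A = \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j}\right)$, $B = \left(1 - \frac{1}{w} \sum_{j=0}^{w-1} f_{i+j-1}\right)$,
and $m = l - 1$. Note that $A \ge B$. Thus

$$A^{l-1} - B^{l-1} = (A - B)(A^{l-2} + A^{l-3} B + \dots + B^{l-2}) \overset{(i)}{\le} (l-1)(A - B) A^{l-2} \overset{(ii)}{=} \frac{(l-1) A^{l-2}}{w} (f_{i-1} - f_{i+w-1}).$$

In step (i) we used the fact that $A \ge B$ implies $A^{l-2} \ge A^p B^q$
for all $p, q \in \mathbb{N}$ so that $p + q = l - 2$. In step (ii) we made the
substitution $A - B = \frac{1}{w}(f_{i-1} - f_{i+w-1})$. Since $x_i = \epsilon A^{l-1}$,
$A^{l-2} = \left(\frac{x_i}{\epsilon}\right)^{\frac{l-2}{l-1}}$. Thus

$$\frac{x_i - x_{i-1}}{\epsilon} \le \frac{(l-1)\left(\frac{x_i}{\epsilon}\right)^{\frac{l-2}{l-1}}}{w} (f_{i-1} - f_{i+w-1}).$$

Consider the term $(f_{i-1} - f_{i+w-1})$. Set $f_{i-1} = C^{r-1}$ and
$f_{i+w-1} = D^{r-1}$, where $C = \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i-1-k}\right)$ and
$D = \left(1 - \frac{1}{w} \sum_{k=0}^{w-1} x_{i+w-1-k}\right)$. Note that $0 \le C, D \le 1$.
Using again (18),

$$(f_{i-1} - f_{i+w-1}) = (C - D)(C^{r-2} + C^{r-3} D + \dots + D^{r-2}) \le (r-1)(C - D).$$

Explicitly,

$$(C - D) = \frac{1}{w} \sum_{k=0}^{w-1} (x_{i+w-1-k} - x_{i-1-k}) \le \frac{1}{w} \sum_{k=0}^{w-1} x_{i+k},$$

which gives us the desired upper bound. By setting all $x_{i+k} = 1$
we obtain the second, slightly weaker, form.

To bound the spacing for the weighted averages we write
$\bar{x}_i$ and $\bar{x}_{i-1}$ explicitly:

$$\bar{x}_i - \bar{x}_{i-1} = \frac{1}{w^2} \Big[ (x_{i+w-1} - x_{i+w-2}) + 2(x_{i+w-2} - x_{i+w-3}) + \dots + w(x_i - x_{i-1}) + (w-1)(x_{i-1} - x_{i-2}) + \dots + (x_{i-w+1} - x_{i-w}) \Big] \le \frac{1}{w^2} \sum_{k=0}^{w-1} x_{i+k} \le \frac{1}{w}.$$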
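
As a quick sanity check of the first chain of inequalities in Lemma 25, one can evaluate both bounds on a numerically computed proper one-sided FP. The Python sketch below does this; the FP is obtained by one-sided forward DE, and all function names and parameter values are illustrative assumptions rather than part of the paper.

```python
def one_sided_fp(eps, l, r, L, w, max_iter=20000, tol=1e-13):
    """One-sided forward DE; x[idx] holds x_i with i = idx - L, i in [-L, 0]."""
    x = [1.0] * (L + 1)

    def val(idx):
        return 0.0 if idx < 0 else x[min(idx, L)]  # 0 on the left, x_0 on the right

    def f(idx):
        return (1.0 - sum(val(idx - k) for k in range(w)) / w) ** (r - 1)

    for _ in range(max_iter):
        x_new = [eps * max(0.0, 1.0 - sum(f(i + j) for j in range(w)) / w) ** (l - 1)
                 for i in range(L + 1)]
        change = max(abs(a - b) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            break
    return x


def spacing_min_slack(x, eps, l, r, w):
    """Smallest slack in x_i - x_{i-1} <= tight <= loose over i in [-L+1, 0]."""
    L = len(x) - 1

    def val(idx):
        return 0.0 if idx < 0 else x[min(idx, L)]

    worst = float("inf")
    for idx in range(1, L + 1):
        # common factor eps (l-1)(r-1) (x_i/eps)^((l-2)/(l-1))
        common = eps * (l - 1) * (r - 1) * (x[idx] / eps) ** ((l - 2) / (l - 1))
        tight = common * sum(val(idx + k) for k in range(w)) / w ** 2
        loose = common / w
        worst = min(worst, tight - (x[idx] - x[idx - 1]), loose - tight)
    return worst
```

A non-negative return value (up to rounding) of `spacing_min_slack(one_sided_fp(0.6, 3, 6, 16, 3), 0.6, 3, 6, 3)` is what the lemma predicts.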



The proof of the following lemma is long. Hence we
relegate it to Appendix III.

Lemma 26 (Transition Length): Let $w \ge 2 l^2 r^2$. Let $(\epsilon, x)$,
$\epsilon \in (\epsilon^{\text{BP}}, 1]$, be a proper one-sided FP of length L. Then, for
all $0 < \delta <$



2=l'iri3(l + 121r) ' 



$$|\{i : \delta < x_i < x_s(\epsilon) - \delta\}| \le w\, \frac{c(l, r)}{\delta},$$

where c(l, r) is a strictly positive constant independent of L
and $\epsilon$.

Let us now show how we can construct a large class of
one-sided FPs which are not necessarily stable. In particular
we will construct increasing FPs. The proof of the following
theorem is relegated to Appendix IV.

Theorem 27 (Existence of One-Sided FPs): Fix the parameters
(l, r, w) and let $x_u(1) < \chi$. Let $L \ge L(l, r, w, \chi)$, where



"{rfl-i 



Alw 



•iw 



(l-i)(X-.Tu(l))'Ai*(l)(x-^u(l))2' 

8w w ^ 

A*(l)(x-a;„(l))(l-i)'f^/- 

There exists a proper one-sided FP x of length L that either
has entropy $\chi$ and channel parameter bounded by

$\epsilon^{\text{BP}}(l, r) < \epsilon < 1$,

or has entropy bounded by 

(i-'-Kx-xuii)) Iw 



2r(L + 1) 



< X{x) < X 



and channel parameter $\epsilon = 1$.

Discussion: We will soon see that, for the range of parameters
of interest, the second alternative is not possible either. In the
light of this, the previous theorem asserts for this range of
parameters the existence of a proper FP of entropy $\chi$. In what
follows, this FP will be the key ingredient to construct the
whole EXIT curve.



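B. Step (ii): Construction of EXIT Curve

Definition 28 (EXIT Curve for (l, r, L, w)-Ensemble): Let
$(\epsilon^*, x^*)$, $0 \le \epsilon^* \le 1$, denote a proper one-sided FP of length
L' and entropy $\chi$. Fix $1 \le L < L'$.

The interpolated family of constellations based on $(\epsilon^*, x^*)$
is denoted by $\{\epsilon(\alpha), x(\alpha)\}_{\alpha=0}^{1}$. It is indexed from -L to L.

This family is constructed from the one-sided FP $(\epsilon^*, x^*)$.
By definition, each element $x(\alpha)$ is symmetric. Hence, it
suffices to define the constellations in the range [-L, 0] and
then to set $x_i(\alpha) = x_{-i}(\alpha)$ for $i \in [0, L]$. As usual, we set
$x_i(\alpha) = 0$ for $i \notin [-L, L]$. For $i \in [-L, 0]$ and $\alpha \in [0, 1]$
define

$$x_i(\alpha) = \begin{cases} (4\alpha - 3) + (4 - 4\alpha) x_0^*, & \alpha \in [\frac{3}{4}, 1], \\ (4\alpha - 2) x_0^* - (4\alpha - 3) x_i^*, & \alpha \in [\frac{1}{2}, \frac{3}{4}), \\ a(i, \alpha), & \alpha \in (\frac{1}{4}, \frac{1}{2}), \\ 4\alpha\, x_{i - L' + L}^*, & \alpha \in (0, \frac{1}{4}], \end{cases}$$

$$\epsilon_i(\alpha) = \frac{x_i(\alpha)}{g(x_{i-w+1}(\alpha), \dots, x_{i+w-1}(\alpha))},$$

where for $\alpha \in (\frac{1}{4}, \frac{1}{2})$,

$$a(i, \alpha) = \left(x^*_{i - \lceil 4(\frac{1}{2} - \alpha)(L' - L) \rceil}\right)^{4(L' - L)(\frac{1}{2} - \alpha) \bmod (1)} \cdot \left(x^*_{i - \lceil 4(\frac{1}{2} - \alpha)(L' - L) \rceil + 1}\right)^{1 - 4(L' - L)(\frac{1}{2} - \alpha) \bmod (1)}.$$

The constellations $x(\alpha)$ are increasing (component-wise) as
a function of $\alpha$, with $x(\alpha = 0) = (0, \dots, 0)$ and with
$x(\alpha = 1) = (1, \dots, 1)$.

Remark: Let us clarify the notation occurring in the definition
of the term $a(i, \alpha)$ above. The expression for $a(i, \alpha)$ consists
of the product of two consecutive sections of $x^*$, indexed
by the subscripts $i - \lceil 4(\frac{1}{2} - \alpha)(L' - L) \rceil$ and
$i - \lceil 4(\frac{1}{2} - \alpha)(L' - L) \rceil + 1$. The erasure values at the two sections are
first raised to the powers $4(L' - L)(\frac{1}{2} - \alpha) \bmod (1)$ and
$1 - 4(L' - L)(\frac{1}{2} - \alpha) \bmod (1)$, before taking their product.
Here, mod (1) represents real numbers in the interval [0, 1].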
Discussion: The interpolation is split into 4 phases. For
$\alpha \in [\frac{3}{4}, 1]$ the constellations decrease from the constant value 1
to the constant value $x_0^*$. For the range $\alpha \in [\frac{1}{2}, \frac{3}{4}]$, the
constellation decreases further, mainly towards the boundaries,
so that at the end of the interval it has reached the value
$x_i^*$ at position i (hence, it stays constant at position 0). The
third phase is the most interesting one. For $\alpha \in [\frac{1}{4}, \frac{1}{2}]$ we
"move in" the constellation $x^*$ by "taking out" sections in
the middle and interpolating between two consecutive points.
In particular, the value $a(i, \alpha)$ is the result of "interpolating"
between two consecutive $x^*$ values, call them $x_j^*$ and $x_{j+1}^*$,
where the interpolation is done in the exponents, i.e., the value
is of the form $(x_j^*)^{\beta} \cdot (x_{j+1}^*)^{1-\beta}$. Finally, in the last phase all values
are interpolated in a linear fashion until they have reached 0.

Example 29 (EXIT Curve for (3, 6, 6, 2)-Ensemble):
Figure 10 shows a small example which illustrates this
interpolation for the (l = 3, r = 6, L = 6, w = 2)-ensemble.
We start with a FP of entropy $\chi = 0.2$ for L' = 12. This
constellation has $\epsilon^* = 0.488223$ and

$x^*$ = (0, 0, 0, 0, 0, 0.015,
0.131, 0.319, 0.408, 0.428, 0.431, 0.432, 0.432).

Note that, even though the constellation is quite short, $\epsilon^*$ is
close to $\epsilon^{\text{MAP}}(l = 3, r = 6) \approx 0.48815$, and $x_0^*$ is close
to $x_s(\epsilon^{\text{MAP}}) \approx 0.4323$. From $(\epsilon^*, x^*)$ we create an EXIT
curve for L = 6. The figure shows 3 particular points of the
interpolation, one in each of the first 3 phases.

Consider, e.g., the top figure corresponding to phase (i). The
constellation x in this case is completely flat. Correspondingly,
the local channel values are also constant, except at the left
boundary, where they are slightly higher to compensate for the
"missing" x-values on the left.



Fig. 10. Construction of EXIT curve for (3, 6, 6, 2)-ensemble. The figure
shows three particular points in the interpolation, namely the points $\alpha$ =
0.781 (phase (i)), $\alpha$ = 0.61 (phase (ii)), and $\alpha$ = 0.4 (phase (iii)). For each
parameter both the constellation x as well as the local channel parameters
$\epsilon$ are shown in the figure on the left. The right column of the figure illustrates
a projection of the EXIT curve. I.e., we plot the average EXIT value of the
constellation versus the channel value of the 0th section. For reference, also
the EBP EXIT curve of the underlying (3, 6)-regular ensemble is shown (gray
line).



The second figure from the top shows a point corresponding
to phase (ii). As we can see, the x-values close to position 0 have not
changed, but the x-values close to the left boundary decrease
towards the solution $x^*$. Finally, the last figure shows a point in
phase (iii). The constellation now "moves in." In this phase, the
$\epsilon$ values are close to $\epsilon^*$, with the possible exception of $\epsilon$ values
close to the right boundary (of the one-sided constellation).
These values can become large.

The proof of the following theorem can be found in
Appendix V.

Theorem 30 (Fundamental Properties of EXIT Curve):
Consider the parameters $(l, r, w)$. Let $(\epsilon^*, x^*)$, $\epsilon^* \in (\epsilon^{\text{BP}}, 1]$,
denote a proper one-sided FP of length $L'$ and entropy $\chi > 0$.
Then for $1 \le L < L'$, the EXIT curve of Definition 28 has
the following properties:

(i) Continuity: The curve $\{\epsilon(\alpha), x(\alpha)\}_{\alpha=0}^{1}$ is continuous for
$\alpha \in [0, 1]$ and differentiable for $\alpha \in [0, 1]$ except for a
finite set of points.

(ii) Bounds in Phase (i): For $\alpha \in [\frac{3}{4}, 1]$,

$$\epsilon_i(\alpha) = \epsilon_0(\alpha), \quad i \in [-L + w - 1, 0],$$

$$\epsilon_i(\alpha) \ge \epsilon_0(\alpha), \quad i \in [-L, 0].$$

(iii) Bounds in Phase (ii): For $\alpha \in [\frac{1}{2}, \frac{3}{4}]$ and $i \in [-L, 0]$,

$$\epsilon_i(\alpha) \ge \epsilon(x^*_0)\, \frac{x^*_{-L}}{x^*_0},$$

where $\epsilon(x) = \frac{x}{(1 - (1 - x)^{r-1})^{l-1}}$.

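(iv) Bounds in Phase (iii): Let

$$\gamma = \left( \frac{(r-1)(l-1)(\epsilon^*)^{\frac{1}{l-1}} (1 + w^{1/8})}{w} \right)^{l-1}. \tag{19}$$

Let $\alpha \in [\frac{1}{4}, \frac{1}{2}]$. For $x_i(\alpha) > \gamma$,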



ei(a) 



< e* 1 



„i/s); 



For $x_i(\alpha) \le \gamma$ and $w > \max\{2^4 l^2 r^2, 2^{16}\}$,

$$\epsilon_i(\alpha) \ge \epsilon^* \left(1 - \frac{4}{w^{1/8}}\right)^{(r-2)(l-1)}, \quad i \in [-L, 0].$$

(v) Area under EXIT Curve: The EXIT value at position i E 
[—L,L] is defined by 



Let 



Jo + 

denote the area of the EXIT integral. Then 

$$\left| A(l, r, w, L) - \left(1 - \frac{l}{r}\right) \right| \le \frac{w}{L}\, l r.$$

(vi) Bound on $\epsilon^*$: For $w > \max\{2^4 l^2 r^2, 2^{16}\}$,

2lT\xt, - x,{e*)\ + c(l,r,w,L) 



|e"*''(l,r)-e*| < 

where 

c(l,r, w, L) = 41ri/7~ 



u)l(2 + r) 



L 



2rl^ 



(l-4u;-5)^ 



C. Step (iii): Operational Meaning of EXIT Curve

Lemma 31 (Stability of $\{(\epsilon(\alpha), x(\alpha))\}_{\alpha=0}^{1}$): Let
$\{(\epsilon(\alpha), x(\alpha))\}_{\alpha=0}^{1}$ denote the EXIT curve constructed
in Definition 28. For $\beta \in (0, 1)$, let

$$\epsilon^{(\beta)} = \inf_{\beta \le \alpha \le 1} \{\epsilon_i(\alpha) : i \in [-L, L]\}.$$

Consider forward DE (cf. Definition 13) with parameter $\epsilon$,
$\epsilon < \epsilon^{(\beta)}$. Then the sequence $x^{(\ell)}$ (indexed from $-L$ to $L$)
converges to a FP which is point-wise upper bounded by $x(\beta)$.

Proof: Recall from Lemma 14 that the sequence $x^{(\ell)}$
converges to a FP of DE, call it $x^{(\infty)}$. We claim that $x^{(\infty)} \le x(\beta)$.

We proceed by contradiction. Assume that $x^{(\infty)}$ is not
point-wise dominated by $x(\beta)$. Recall that by construction of
$x(\alpha)$ the components are decreasing in $\alpha$ and that they are
continuous. Further, $x^{(\infty)} \le \epsilon < x(1)$. Therefore,

$$\gamma = \inf_{\beta \le \alpha \le 1} \{\alpha \mid x^{(\infty)} \le x(\alpha)\}$$

is well defined. By assumption $\gamma > \beta$. Note that there must
exist at least one position $i \in [-L, 0]$ so that $x_i(\gamma) = x_i^{(\infty)}$.
But since $\epsilon < \epsilon_i(\gamma)$ and since $g(\cdots)$ is monotone in its
components,

$$x_i(\gamma) = \epsilon_i(\gamma)\, g(x_{i-w+1}(\gamma), \dots, x_{i+w-1}(\gamma)) > \epsilon\, g(x^{(\infty)}_{i-w+1}, \dots, x^{(\infty)}_{i+w-1}) = x_i^{(\infty)},$$

a contradiction.



i e [-L + w - 1,-w + 1], 



D. Step (iv): Putting it all Together 

We have now all the necessary ingredients to prove Theorem 10.
In fact, the only statement that needs proof is Q.

First note that $\epsilon^{\text{BP}}(l, r, L, w)$ is a non-increasing function in
$L$. This follows by comparing DE for two constellations, one,
say, of length $L_1$ and one of length $L_2$, $L_2 > L_1$. It therefore
suffices to prove Q for the limit of $L$ tending to infinity.
Let (l,r,u)) be fixed with w > w{l,r), where 

, .... . (21^(1 ' 

w(l, r) = max < 2 



>16 



2W 



l_2-V(r-2) 



' (l-^V(r-2))16(l(l_l))8/- 

Our strategy is as follows. We pick $L'$ (length of constellation)
sufficiently large (we will soon see what "sufficiently" means)
and choose an entropy, call it $\hat{\chi}$. Then we apply Theorem 27.
Throughout this section, we will use $x^*$ and $\epsilon^*$ to denote the
FP and the corresponding channel parameter guaranteed by
Theorem 27. We are faced with two possible scenarios. Either
there exists a FP with the desired properties or there exists a
FP with parameter $\epsilon^* = 1$ and entropy at most $\hat{\chi}$. We will
then show (using Theorem 30) that for sufficiently large $L'$ the
second alternative is not possible. As a consequence, we will
have shown the existence of a FP with the desired properties.
Using again Theorem 30 we then show that $\epsilon^*$ is close to
$\epsilon^{\text{MAP}}$ and that $\epsilon^*$ is a lower bound for the BP threshold of the
coupled code ensemble.

Let us make this program precise. Pick $\hat{\chi} = \frac{x^{\text{BP}}(l, r) + x_u(1)}{2}$
and $L'$ "large". In many of the subsequent steps we require
specific lower bounds on $L'$. Our final choice is one which
obeys all these lower bounds. Apply Theorem 27 with parameters
$L'$ and $\hat{\chi}$. We are faced with two alternatives.

Consider first the possibility that the constructed one-sided 
FP X* has parameter e* = 1 and entropy bounded by 

{1 - l){x-' ~ x,{l)) 



Iw 



16 2r(i' + 1) 

For sufficiently large L' this can be simplified to 

-Xu{l) 



32 



-Xuil) 



(20) 



It is not hard to show that under forward DE, the constellation $x^{(\ell)}$
is unimodal and symmetric around 0. This immediately follows from an
inductive argument using Definition 13.

Let us now construct an EXIT curve based on $(\epsilon^*, x^*)$ for a
system of length $L$, $1 \le L < L'$. According to Theorem 30
it must be true that

e* < e"*''(l r) + - a:s(£*)| + c(l, r, w, L) 

We claim that by choosing $L'$ sufficiently large and by
choosing $L$ appropriately we can guarantee that

$$|x^*_0 - x_s(\epsilon^*)| < \delta, \quad |x^*_0 - x^*_{-L}| < \delta, \quad x^*_{-L'+L} < \delta, \tag{22}$$

where $\delta$ is any strictly positive number. If we assume this claim
for a moment, then we see that the right-hand side of (21) can
be made strictly less than 1. Indeed, this follows from $w >
w(l, r)$ (hypothesis of the theorem) by choosing $\delta$ sufficiently
small (by making $L'$ large enough) and by choosing $L$ to be
proportional to $L'$ (we will see how this is done in the sequel).
This is a contradiction, since by assumption $\epsilon^* = 1$. This will
show that the second alternative must apply.



Let us now prove the bounds in (22). In the sequel we
say that sections with values in the interval $[0, \delta]$ are part of
the tail, that sections with values in $[\delta, x_s(\epsilon^*) - \delta]$ form the
transition, and that sections with values in $[x_s(\epsilon^*) - \delta, x_s(\epsilon^*)]$
represent the flat part. Recall from Definition 15 that the
entropy of a constellation is the average (over all the $2L + 1$
sections) erasure fraction. The bounds in (22) are equivalent
to saying that both the tail as well as the flat part must have
length at least $L$. From Lemma 26, for sufficiently small $\delta$,
the transition has length at most $\frac{w\, c(l, r)}{\delta}$ (i.e., the number of
sections $i$ with erasure value, $x_i$, in the interval $[\delta, x_s(\epsilon^*) - \delta]$),
a constant independent of $L'$. Informally, therefore, most of
the length $L'$ consists of the tail or the flat part.
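The split into tail, transition, and flat part can be illustrated on the short constellation quoted in the example for Fig. 10. The sketch below is only illustrative: the value used for $x_s(\epsilon^*)$ is the quoted $x_s(\epsilon^{\text{MAP}}) \approx 0.4323$ taken as a stand-in, and $\delta = 0.02$ is an arbitrary choice that does not come from the paper.

```python
# Constellation x* from the (3, 6, 6, 2) example (positions -12, ..., 0).
x_star = [0, 0, 0, 0, 0, 0.015,
          0.131, 0.319, 0.408, 0.428, 0.431, 0.432, 0.432]

x_s = 0.4323   # assumed stand-in for x_s(eps*), see lead-in
delta = 0.02   # illustrative tolerance, not from the paper

# Tail: values in [0, delta]; flat part: values in [x_s - delta, x_s];
# transition: everything strictly in between.
tail = [x for x in x_star if x <= delta]
flat = [x for x in x_star if x >= x_s - delta]
transition = [x for x in x_star if delta < x < x_s - delta]

print(len(tail), len(transition), len(flat))
```

Even for this short constellation only a handful of sections fall into the transition, which matches the qualitative picture that the transition length does not grow with $L'$.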

Let us now show all this more precisely. First, we show
that the flat part is large, i.e., it is at least a fixed fraction of
$L'$. We argue as follows. Since the transition contains only a
constant number of sections, its contribution to the entropy is
small. More precisely, this contribution is upper bounded by
$\frac{w\, c(l, r)}{(L'+1)\delta}$. Further, the contribution to the entropy from the tail is
small as well, namely at most $\delta$. Hence, the total contribution
to the entropy stemming from the tail plus the transition is at
most $\frac{w\, c(l, r)}{(L'+1)\delta} + \delta$. However, the entropy of the FP is equal to
$\frac{x^{\text{BP}} + x_u(1)}{2}$. As a consequence, the flat part must have length
which is at least a fraction $\frac{x^{\text{BP}} + x_u(1)}{2} - \frac{w\, c(l, r)}{(L'+1)\delta} - \delta$ of $L'$. This
fraction is strictly positive if we choose $\delta$ small enough and
$L'$ large enough.

By a similar argument we can show that the tail length
is also a strictly positive fraction of $L'$. From Lemma 23,
$x_s(\epsilon^*) > x^{\text{BP}}$. Hence the flat part cannot be too large since the
entropy is equal to $\frac{x^{\text{BP}} + x_u(1)}{2}$, which is strictly smaller than
$x^{\text{BP}}$. As a consequence, the tail has length at least a fraction
$1 - \frac{x^{\text{BP}} + x_u(1)}{2(x^{\text{BP}} - \delta)} - \frac{w\, c(l, r)}{(L'+1)\delta}$ of $L'$. As before, this fraction is also
strictly positive if we choose $\delta$ small enough and $L'$ large
enough. Hence, by choosing $L$ to be the lesser of the length
of the flat part and the tail, we conclude that the bounds in
(22) are valid and that $L$ can be chosen arbitrarily large (by
increasing $L'$).

Consider now the second case. In this case $x^*$ is a proper
one-sided FP with entropy equal to $\frac{x^{\text{BP}}(l, r) + x_u(1)}{2}$ and with
parameter $\epsilon^{\text{BP}}(l, r) < \epsilon^* < 1$. Now, using again Theorem 30 we
can show

2rl^ 



4lr 



e* > e"*''(l,r)-2u.^ 



(1-4U! S)r 



1>3 



4lr 4- 



1 



> e"*"(l,r)-2w;- 



)- — )2 
(l-4«)" 5 )r 



(1-2-^)2 



To obtain the above expression, we take $L'$ to be sufficiently
large in order to bound the term in $c(l, r, w, L)$ which contains
$L$. We also use (22) and choose $\delta$ to be sufficiently small to
bound the corresponding terms. We also replace w^'^/^ by 
y;-i/8 jjj c(l,r,u',L). 

To summarize: we conclude that for an entropy equal to
$\frac{x^{\text{BP}}(l, r) + x_u(1)}{2}$, for sufficiently large $L'$, $x^*$ must be a proper
one-sided FP with parameter $\epsilon^*$ bounded as above.

Finally, let us show that $\epsilon^* (1 - 4 w^{-1/8})^{(r-2)(l-1)}$ is a lower bound
on the BP threshold. We start by claiming that



u-i/s) u^i/s) 



(r-2)(l-l) 



inf 

\<a<l 



{e,{a):te[-L,L]}. 



To prove the above claim we just need to check that
$\epsilon(x^*_0)\, x^*_{-L} / x^*_0$ (see bounds in phase (ii) of Theorem 30) is
greater than the above infimum. Since in the limit of $L' \to \infty$,
$\epsilon(x^*_0)\, x^*_{-L} / x^*_0 \to \epsilon^*$, for sufficiently large $L'$ the claim is true.

From the hypothesis of the theorem we have $w > 2^{16}$.
Hence $\epsilon^* (1 - 4 w^{-1/8})^{(r-2)(l-1)} > 0$. Apply forward DE (cf. Definition 13)
with parameter $\epsilon < \epsilon^* (1 - 4 w^{-1/8})^{(r-2)(l-1)}$ and length
$L$. Denote the FP by $x^{\infty}$ (with indices belonging to $[-L, L]$).
From Lemma 31 we then conclude that $x^{\infty}$ is point-wise upper
bounded by $x(\frac{1}{4})$. But for $\alpha = 1/4$ we have

$$x_i(1/4) \le x_0(1/4) = x^*_{-L'+L} \le \delta < x_u(1) \quad \forall i,$$

where we make use of the fact that $\delta$ can be chosen arbitrarily
small. Thus $x_i^{\infty} < x_u(1)$ for all $i \in [-L, L]$. Consider a
one-sided constellation, $y$, with $y_i = x_0(1/4) < x_u(1)$ for all
$i \in [-L, 0]$. Recall that for a one-sided constellation $y_i = y_0$
for all $i > 0$ and as usual $y_i = 0$ for $i < -L$. Clearly, $x^{\infty} \le
y$. Now apply one-sided forward DE to $y$ with parameter $\epsilon$
(same as the one we applied to get $x^{\infty}$) and call its limit
$y^{\infty}$. From part (i) of Lemma 22 we conclude that the limit
$y^{\infty}$ is either proper or trivial. Suppose that $y^{\infty}$ is proper
(implies non-trivial). Clearly, $y_i^{\infty} < x_u(1)$ for all $i \in [-L, 0]$.
But from Lemma 23 we have that for any proper one-sided
FP $y_0 \ge x_u(\epsilon) \ge x_u(1)$, a contradiction. Hence we conclude
that $y^{\infty}$ must be trivial and so must be $x^{\infty}$.
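The effect established by the proof can be observed numerically. The following is a minimal sketch of forward DE over the BEC, assuming the standard coupled recursion $x_i = \epsilon \big(1 - \frac{1}{w}\sum_{j=0}^{w-1} (1 - \frac{1}{w}\sum_{k=0}^{w-1} x_{i+j-k})^{r-1}\big)^{l-1}$ with $x_i = 0$ outside $[-L, L]$ and the all-one starting point. The function names and the parameter choice ($\epsilon = 0.45$, $L = 8$, $w = 3$) are illustrative and not taken from the paper.

```python
def coupled_de(eps, l, r, L, w, max_iters=20000, tol=1e-12):
    """Forward DE for the (l, r, L, w) ensemble over the BEC(eps).

    Sections -L..L are stored at list index i + L; positions outside
    this range are fixed to 0 (perfectly known boundary).
    """
    n = 2 * L + 1
    x = [1.0] * n  # forward DE starts from the all-one constellation

    def get(v, idx):
        return v[idx] if 0 <= idx < n else 0.0

    for _ in range(max_iters):
        new = []
        for i in range(n):
            outer = 0.0
            for j in range(w):
                inner = sum(get(x, i + j - k) for k in range(w)) / w
                outer += (1.0 - inner) ** (r - 1)
            new.append(eps * (1.0 - outer / w) ** (l - 1))
        change = max(abs(a - b) for a, b in zip(new, x))
        x = new
        if change < tol:
            break
    return x


def uncoupled_de(eps, l, r, iters=5000):
    """Forward DE for the underlying (l, r)-regular ensemble."""
    x = 1.0
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** (r - 1)) ** (l - 1)
    return x


eps = 0.45  # above the uncoupled (3, 6) BP threshold, below its MAP threshold
x_coupled = coupled_de(eps, l=3, r=6, L=8, w=3)
x_single = uncoupled_de(eps, l=3, r=6)
print(f"coupled: max erasure = {max(x_coupled):.2e}; uncoupled FP = {x_single:.4f}")
```

With these parameters the uncoupled recursion gets stuck at a non-trivial FP, whereas the coupled constellation is driven to the trivial FP from the boundaries inward.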

VI. Discussion and Possible Extensions 
A. New Paradigm for Code Design 

The explanation of why convolutional-like LDPC ensembles 
perform so well given in this paper gives rise to a new 
paradigm in code design. 

In most designs of codes based on graphs one encounters
a trade-off between the threshold and the error floor behavior.
E.g., for standard irregular graphs an optimization of the
threshold tends to push up the number of degree-two variable
nodes. The same quantity, on the other hand, favors the
existence of low weight (pseudo)codewords.

For convolutional-like LDPC ensembles the important op- 
erational quantity is the MAP threshold of the underlying 
ensemble. As, e.g., regular LDPC ensembles show, it is simple 
to improve the MAP threshold and to improve the error- 
floor performance - just increase the minimum variable-node 
degree. From this perspective one should simply pick as large 
a variable-node degree as possible. 

There are some drawbacks to picking large degrees. First, 
picking large degrees also increases the complexity of the 
scheme. Second, although currently little is known about the 
scaling behavior of the convolutional-like LDPC ensembles, 
it is likely that large degrees imply a slowing down of the 
convergence of the performance of finite-length ensembles to 
the asymptotic limit. This implies that one has to use large 
block lengths. Third, the larger we pick the variable-node 
degrees the higher the implied rate loss. Again, this implies 
that we need very long codes in order to bring down the rate 
loss to acceptable levels. It is tempting to conjecture that the 
minimum rate loss that is required in order to achieve the 
change of thresholds is related to the area under the EXIT 
curve between the MAP and the BP threshold. E.g., in FigurejS] 
this is the light gray area. For the underlying ensemble this 
is exactly the amount of guessing (help) that is needed so 
that a local algorithm can decode correctly, assuming that the 
underlying channel parameter is the MAP threshold. 

For these reasons, an actual code design will
try to maintain relatively small average degrees so
as to keep this gray area small. But the additional degree of
freedom can be used to design codes with good thresholds and
good error floors.



B. Scaling Behavior 

In our design there are three parameters that tend to infinity:
the number of variable nodes at each position, called $M$, the
length of the constellation $L$, and the length of the smoothing
window $w$. Assume we fix $w$ and we are content with
achieving a threshold slightly below the MAP threshold. How
should we scale $M$ with respect to $L$ so that we achieve the
best performance? This question is of considerable practical
importance. Recall that the total length of the code is of order
$L \cdot M$. We would therefore like to keep this product small.
Further, the rate loss is of order $1/L$ (so $L$ should be large) and
$M$ should be chosen large so as to approach the performance
predicted by DE. Finally, how does the number of required
iterations scale as a function of $L$?

Also, in the proof we assumed that we fix $L$ and let $M$
tend to infinity so that we can use DE techniques. We have
seen that in this limit the boundary conditions of the system
dictate the performance of the system regardless of the size of
$L$ (as long as $L$ is fixed and $M$ tends to infinity). Is the same
behavior still true if we let $L$ tend to infinity as a function of
$M$? At what scaling does the behavior change?



C. Tightening of Proof 

As mentioned already in the introduction, our proof is weak:
it promises that the BP threshold approaches the MAP
threshold of the underlying ensemble at a speed of $w^{-1/8}$.
Numerical experiments indicate that the actual convergence
speed is likely to be exponential and that the prefactors are
very small. Why is the analytic statement so loose and how
can it be improved?

Within our framework it is clear that at many places the 
constants could be improved at the cost of a more involved 
proof. It is therefore likely that a more careful analysis 
following the same steps will give improved convergence 
speeds. 

More importantly, for mathematical convenience we con- 
structed an "artificial" EXIT curve by interpolating a particular 
fixed point and we allowed the channel parameter to vary as a 
function of the position. In the proof we then coarsely bounded 
the "operational" channel parameter by the minimum of all 
the individual channel parameters. This is a significant source 
for the looseness of the bound. A much tighter bound could 
be given if it were possible to construct the EXIT curve by 
direct methods. As we have seen, it is possible to show the 
existence of FPs of DE for a wide range of EXIT values. The 
difficulty consists in showing that all these individual FPs form 
a smooth one-dimensional manifold so that one can use the 
Area Theorem and integrate with respect to this curve. 

D. Extensions to BMS Channels and General Ensembles 

Preliminary numerical evidence suggests that the behavior
of the convolutional-like LDPC ensembles discussed in this
paper is not restricted to the BEC or to regular
ensembles but is a general phenomenon. We will be brief.
A more detailed discussion can be found in the two recent 
papers [26], [27]. Let us quickly discuss how one might want 
to attack the more general setup. 

We have seen that the proof consists essentially of three 
steps. 

(i) Existence of FP: As long as we stay with the BEC, 
a similar procedure as the one used in the proof of 
Theorem |27] can be used to show the existence of the 
desired FP for more general ensembles. 

General BMS channels are more difficult to handle, but 
FP theorems do exist also in the setting of infinite- 
dimensional spaces. The most challenging aspect of this 
step is to prove that the constructed FP has the essential 
basic characteristics that we relied upon for our later 
steps. In particular, we need it to be unimodal, to have 
a short transition period, and to approach the FP density 
of the underlying standard ensemble. 

(ii) Construction of EXIT Curve and Bounds: Recall that in
order to create a whole EXIT curve, we started with a
FP and interpolated the value of neighboring points. In
order to ensure that each such interpolated constellation
is indeed a FP, we allowed the local channel parameters
to vary. By choosing the interpolation properly, we were
then able to show that this variation is small. As long
as one remains in the realm of BECs, the same
technique can in principle be applied to other ensembles.
For general channels the construction seems more challenging.
It is not true in general that, given a constellation,
one can always find "local" channels that make this
constellation a FP. It is therefore not clear how an
interpolation for general channels can be accomplished.
This is perhaps the most challenging hurdle for any
potential generalization.

(iii) Operational Interpretation: For the operational interpretation
we relied upon the notion of physical degradation.
We showed that, starting with a channel parameter of
a channel which is upgraded w.r.t. any of the local
channels used in the construction of the EXIT curve,
we do not get stuck in a non-trivial FP. For the BEC,
the notion of degradation is very simple: it is the natural
order on the set of erasure probabilities, and this is a total
order. For general channels, an order on channels still
exists in terms of degradation, but this order is partial.
We therefore require that the local channels used in the
construction of the EXIT curve are all degraded w.r.t. a
channel of the original channel family (e.g., the family
of Gaussian channels) with a parameter which is only
slightly better than the parameter which corresponds to
the MAP threshold.

E. Extension to General Coupled Graphical Systems 

Codes based on graphs are just one instance of graphical 
systems that have distinct thresholds for "local" algorithms 
(what we called the BP threshold) and for "optimal" algo- 
rithms (what we called the MAP threshold). To be sure, coding 
is somewhat special - it is conjectured that the so-called 
replica-symmetric solution always determines the threshold 
under MAP processing for codes based on graphs. Nevertheless,
it is interesting to investigate to what extent the coupling
of general graphical systems shows a similar behavior. Is
there a general class of graphical models in which the same
phenomenon occurs? If so, can this phenomenon either be used 
to analyze systems or to devise better algorithms? 

Acknowledgment 

We would like to thank N. Macris for his help in choosing
the title and sharing his insights and the reviewers for their
thorough reading and numerous suggestions. We would also
like to thank D. J. Costello, Jr., P. Vontobel, and A. R. Iyengar
for their many comments and very helpful feedback on an
earlier draft. Last but not least we would like to thank G. D.
Forney, Jr. for handling our paper. The work of S. Kudekar was
supported by grant no. 200020-113412 from the Swiss National
Foundation.



K. This circular definition symmetrizes all positions, which 
in turn simplifies calculations. 

As we will see shortly, most codes in this circular ensemble
have a minimum stopping set distance which is a linear
fraction of $M$. To make contact with our original problem
we now argue as follows. Set $K = 2L + l$. If, for the circular
ensemble, we take $l - 1$ consecutive positions and set them
to $0$ then this "shortened" ensemble has length $2L + 1$ and it
is in one-to-one correspondence with the $(l, r, L)$ ensemble.
Clearly, no new stopping sets are introduced by shortening the
ensemble. This proves the claim.

Let $A(l, r, M, K, w)$ denote the expected number of stopping
sets of weight $w$ of the "circular" ensemble. Let $C$ denote
a code chosen uniformly at random from this ensemble.

Recall that every variable node at position $i$ connects to a
check node at positions $i - \hat{l}, \dots, i + \hat{l}$, modulo $K$. There are
$M$ variable nodes at each position and $M \frac{l}{r}$ check nodes at
each position. Conversely, the $Ml$ edges entering the check
nodes at position $i$ come equally from variable nodes at
position $i - \hat{l}, \dots, i + \hat{l}$. These $Ml$ edges are connected to
the check nodes via a random permutation.

Let $w_k$, $k \in \{0, \dots, K - 1\}$, $0 \le w_k \le M$, denote the
weight at position $k$, i.e., the number of variable nodes at
position $k$ that have been set to 1. Call $w = (w_0, \dots, w_{K-1})$
the type. We are interested in the expected number of stopping
sets for a particular type; call this quantity $A(l, r, M, K, w)$.
Since the parameters $(l, r, M, K)$ are understood from the
context, we shorten the notation to $A(w)$. We claim that



A{w) 



nK-l / A/1 



K-i (Af +1) ( E^^^)coef{p(x)*^? , } 

^ n 

fc=0 



(23) 



where $p(x) = \sum_{i \ne 1} \binom{r}{i} x^i$. This expression is easily explained.
The $w_k$ variable nodes at position $k$ that are set to 1 can be
distributed over the $M$ variable nodes in $\binom{M}{w_k}$ ways. Next, we
have to distribute the $\sum_{i=-\hat{l}}^{\hat{l}} w_{k+i}$ ones among the $M \frac{l}{r}$ check
nodes in such a way that every check node is fulfilled (since
we are looking for stopping sets, "fulfilled" means that a check
node is either connected to no variable node with associated
value "1" or to at least two such nodes). This is encoded by
$\text{coef}\{p(x)^{M \frac{l}{r}}, x^{\sum_{i=-\hat{l}}^{\hat{l}} w_{k+i}}\}$. Finally, we have to divide by the
total number of possible connections; there are $Ml$ check node
sockets at position $k$ and we distribute $\sum_{i=-\hat{l}}^{\hat{l}} w_{k+i}$ ones. This
can be done in $\binom{Ml}{\sum_{i=-\hat{l}}^{\hat{l}} w_{k+i}}$ ways. To justify step (a) note that



Appendix I 
Proof of Lemma[T] 

We proceed as follows. We first consider a "circular"
ensemble. This ensemble is defined in an identical manner
as the $(l, r, L)$ ensemble except that the positions are now
from $0$ to $K - 1$ and index arithmetic is performed modulo



n 



M 

Wk+, 



1 •"'fc+t 
i M 



<(M + 1) , 



M 



1 v^i 



18 



Note that, besides the factor $(M + 1)$, which is negligible,
each term in the product (23) has the exact form of the
average stopping set weight distribution of the standard $(l, r)$-ensemble
of length $M$ and weight $\frac{1}{l} \sum_{i=-\hat{l}}^{\hat{l}} w_{k+i}$. (Potentially
this weight is non-integral but the expression is nevertheless
well defined.)
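The coefficient term in (23) counts socket assignments in which no check node sees exactly one "1". The following sketch illustrates that counting rule on a toy instance and compares it with brute-force enumeration. The helper names `coef_p_power` and `brute_force` are hypothetical, and the tiny parameters are chosen only to keep the enumeration small.

```python
from itertools import combinations
from math import comb


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def coef_p_power(r, n_checks, e):
    """Coefficient of x^e in p(x)^n_checks, p(x) = sum_{i != 1} C(r, i) x^i."""
    p = [comb(r, i) if i != 1 else 0 for i in range(r + 1)]
    acc = [1]
    for _ in range(n_checks):
        acc = poly_mul(acc, p)
    return acc[e] if e < len(acc) else 0


def brute_force(r, n_checks, e):
    """Count placements of e ones on r * n_checks sockets such that no
    check node (a group of r consecutive sockets) holds exactly one 1."""
    count = 0
    for ones in combinations(range(r * n_checks), e):
        per_check = [0] * n_checks
        for s in ones:
            per_check[s // r] += 1
        if all(c != 1 for c in per_check):
            count += 1
    return count


print(coef_p_power(3, 2, 2), brute_force(3, 2, 2))
print(coef_p_power(3, 3, 4), brute_force(3, 3, 4))
```

Both approaches agree on these small cases, which is the combinatorial fact used when writing the number of "fulfilled" check-node configurations as a polynomial coefficient.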

We can therefore leverage known results concerning the
stopping set weight distribution for the underlying $(l, r)$-regular
ensembles. For the $(l, r)$-regular ensembles we know
that the relative minimum distance is at least $\hat{\omega}(l, r)$ with
high probability [13, Lemma D.17]. Therefore, as long as
$\frac{1}{lM} \sum_{i=-\hat{l}}^{\hat{l}} w_{k+i} < \hat{\omega}(l, r)$, for all $0 \le k < K$, $\frac{1}{MK} \log A(w)$
is strictly negative and so most codes in the ensemble do not
have stopping sets of this type. The claim now follows since
in order for the condition $\frac{1}{lM} \sum_{i=-\hat{l}}^{\hat{l}} w_{k+i} < \hat{\omega}(l, r)$ to be
violated for at least one position $k$ we need $\frac{1}{M} \sum_{k=0}^{K-1} w_k$ to
exceed $l\, \hat{\omega}(l, r)$.






Appendix II 
Basic Properties of h{x) 

Recall the definition of $h(x)$ from (2). We have:

Lemma 32 (Basic Properties of $h(x)$): Consider the $(l, r)$-regular
ensemble with $l \ge 3$ and let $\epsilon \in (\epsilon^{\text{BP}}, 1]$.

(i) $h'(x_u(\epsilon)) > 0$ and $h'(x_s(\epsilon)) < 0$; $|h'(x)| \le lr$ for $x \in [0, 1]$.

(ii) There exists a unique value $0 \le x_*(\epsilon) \le x_u(\epsilon)$ so that
$h'(x_*(\epsilon)) = 0$, and there exists a unique value $x_u(\epsilon) \le
x^*(\epsilon) \le x_s(\epsilon)$ so that $h'(x^*(\epsilon)) = 0$.

(iii) Let

$$\kappa_*(\epsilon) = \min\left\{-h'(0),\ \frac{-h(x_*(\epsilon))}{x_*(\epsilon)}\right\},$$

$$\lambda_*(\epsilon) = \min\left\{h'(x_u(\epsilon)),\ \frac{-h(x_*(\epsilon))}{x_u(\epsilon) - x_*(\epsilon)}\right\},$$

$$\kappa^*(\epsilon) = \min\left\{h'(x_u(\epsilon)),\ \frac{h(x^*(\epsilon))}{x^*(\epsilon) - x_u(\epsilon)}\right\},$$

$$\lambda^*(\epsilon) = \min\left\{-h'(x_s(\epsilon)),\ \frac{h(x^*(\epsilon))}{x_s(\epsilon) - x^*(\epsilon)}\right\}.$$

The quantities $\kappa_*(\epsilon)$, $\lambda_*(\epsilon)$, $\kappa^*(\epsilon)$, and $\lambda^*(\epsilon)$ are
non-negative and depend only on the channel parameter $\epsilon$ and
the degrees $(l, r)$. In addition, $\kappa_*(\epsilon)$ is strictly positive
for all $\epsilon \in [0, 1]$.

(iv) For $0 \le \epsilon \le 1$,

$$x_*(\epsilon) > \frac{1}{l^2 r^2}.$$

(v) For $0 \le \epsilon \le 1$,

$$\kappa_*(\epsilon) \ge \frac{1}{8 r^2}.$$

(vi) If we draw a line from $0$ with slope $-\kappa_*$, then $h(x)$ lies
below this line for $x \in [0, x_*]$.

If we draw a line from $x_u(\epsilon)$ with slope $\lambda_*$, then $h(x)$
lies below this line for all $x \in [x_*, x_u(\epsilon)]$.

If we draw a line from $x_u(\epsilon)$ with slope $\kappa^*$, then $h(x)$
lies above this line for $x \in [x_u(\epsilon), x^*]$.

Finally, if we draw a line from $x_s(\epsilon)$ with slope $-\lambda^*$,
then $h(x)$ lies above this line for all $x \in [x^*, x_s(\epsilon)]$.

Fig. 11. Pictorial representation of the various quantities which appear in
Lemma 32. We use the $(3, 6)$ ensemble to transmit over a BEC with erasure
probability $\epsilon = 0.44$. The function $h(x) = 0.44(1 - (1 - x)^5)^2 - x$ is
represented in the figure by the smooth bold curve. The roots of $h(x) = 0$
or, equivalently, the FPs of DE are given by $0$, $x_u(0.44) \approx 0.2054$, and
$x_s(0.44) \approx 0.3265$. There are only two stationary points of $h(x)$, i.e., only
two points at which $h'(x) = 0$. They are given by $x_*(0.44) \approx 0.0697$ and
$x^*(0.44) \approx 0.2673$. Along with the curve $h(x)$, the figure contains three
dashed lines representing the tangents at the points $0$, $x_u(0.44)$ and $x_s(0.44)$.
The slopes of the tangents at $0$, $x_u(0.44)$ and $x_s(0.44)$ are $h'(0) = -1$,
$h'(x_u) = 0.1984$ and $h'(x_s) = -0.2202$, respectively. Also shown are the
four lines which bound $h(x)$ in the various regions. These lines are given
(their end-points) by: $\{(0, 0), (x_*, h(x_*))\}$, $\{(x_*, h(x_*)), (x_u(0.44), 0)\}$,
$\{(x_u(0.44), 0), (x^*, h(x^*))\}$ and $\{(x^*, h(x^*)), (x_s(0.44), 0)\}$ and have
slopes $-0.4191$, $0.2157$, $0.1048$ and $-0.1098$ respectively. Thus we have
$\kappa^*(0.44) = 0.1048$, $\lambda^*(0.44) = 0.1098$, $\kappa_*(0.44) = 0.4191$ and
$\lambda_*(0.44) = 0.1984$.



Example 33 ($(3, 6)$-Ensemble): Consider transmission using
a code from the $(3, 6)$ ensemble over a BEC with $\epsilon = 0.44$.
The fixed point equation for the BP decoder is given by

$$x = 0.44(1 - (1 - x)^5)^2.$$

The function $h(x) = 0.44(1 - (1 - x)^5)^2 - x$ is shown in
Figure 11. The equation $h(x) = 0$ has exactly 3 real roots,
namely, $0$, $x_u(0.44) \approx 0.2054$ and $x_s(0.44) \approx 0.3265$. Further
properties of $h(x)$ are shown in Figure 11.
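The numbers quoted in this example and in the caption of Fig. 11 can be reproduced with a few lines of code. The sketch below assumes $h(x) = \epsilon(1 - (1-x)^{r-1})^{l-1} - x$ with its derivative computed by hand, uses plain bisection with hand-picked bracketing intervals, and evaluates the four slope quantities as minima of a tangent slope and a chord slope. All identifiers are illustrative.

```python
EPS, L_DEG, R_DEG = 0.44, 3, 6  # values taken from Example 33


def h(x):
    """h(x) = eps * (1 - (1 - x)^(r-1))^(l-1) - x."""
    return EPS * (1 - (1 - x) ** (R_DEG - 1)) ** (L_DEG - 1) - x


def dh(x):
    """First derivative h'(x)."""
    return (EPS * (L_DEG - 1) * (R_DEG - 1) * (1 - x) ** (R_DEG - 2)
            * (1 - (1 - x) ** (R_DEG - 1)) ** (L_DEG - 2) - 1)


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Bracketing intervals were chosen by inspecting the signs of h and h'.
x_u = bisect(h, 0.10, 0.25)    # unstable FP
x_s = bisect(h, 0.25, 0.50)    # stable FP
x_lo = bisect(dh, 0.01, 0.10)  # stationary point x_* (minimum of h)
x_hi = bisect(dh, 0.20, 0.30)  # stationary point x^* (maximum of h)

# Slope quantities: minimum of a tangent slope and a chord slope.
kappa_lo = min(-dh(0.0), -h(x_lo) / x_lo)
lambda_lo = min(dh(x_u), -h(x_lo) / (x_u - x_lo))
kappa_hi = min(dh(x_u), h(x_hi) / (x_hi - x_u))
lambda_hi = min(-dh(x_s), h(x_hi) / (x_s - x_hi))

print(f"x_u={x_u:.4f} x_s={x_s:.4f} x_*={x_lo:.4f} x^*={x_hi:.4f}")
print(f"kappa_*={kappa_lo:.4f} lambda_*={lambda_lo:.4f} "
      f"kappa^*={kappa_hi:.4f} lambda^*={lambda_hi:.4f}")
```

The printed values should agree with those quoted for $\epsilon = 0.44$ up to rounding in the last digit.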
Let us prove each part separately. In order to lighten our
notation, we drop the $\epsilon$ dependence for quantities like $x_u$, $x_s$,
$x_*$, or $x^*$.

(i) Note that $h(x) > 0$ for all $x \in (x_u, x_s)$, with equality
at the two ends. This implies that $h'(x_u) > 0$ and that
$h'(x_s) < 0$. With respect to the derivative, we have

$$|h'(x)| = \left|\epsilon (l-1)(r-1)(1-x)^{r-2}(1 - (1-x)^{r-1})^{l-2} - 1\right| \le (l-1)(r-1) + 1 \le lr.$$



(ii) We claim that $h''(x) = 0$ has exactly one real solution in
$(0, 1)$. We have

$$h''(x) = \epsilon (l-1)(r-1)(1-x)^{r-3}(1 - (1-x)^{r-1})^{l-3} \times \left[(1-x)^{r-1}(lr - l - r) - r + 2\right]. \tag{24}$$

Thus $h''(x) = 0$ for $x \in (0, 1)$ only at

$$x = 1 - \left(\frac{r-2}{lr - l - r}\right)^{\frac{1}{r-1}}. \tag{25}$$

Since $l \ge 3$, the above solution is in $(0, 1)$.
Since $h(0) = h(x_u) = h(x_s) = 0$, we know from Rolle's
theorem that there must exist an $0 \le x_* \le x_u$ and an
$x_u \le x^* \le x_s$, such that $h'(x_*) = h'(x^*) = 0$.
Now suppose that there exists a $y \in (0, 1)$, $x_* \ne y \ne x^*$,
such that $h'(y) = 0$, so that $h'(\cdot)$ vanishes at three distinct
places in $(0, 1)$. Then by Rolle's theorem we conclude
that $h''(x) = 0$ has at least two roots in the interval
$(0, 1)$, a contradiction.

(iii) To check that the various quantities in part (iii) are strictly
positive, it suffices to verify that $h(x_*) \ne 0$ and $h(x^*) \ne 0$.
But we know from Lemma 9 that $h(x) = 0$ has exactly
two solutions, namely $x_u$ and $x_s$, and neither of them is
equal to $x_*$ or $x^*$ since $h'(x_u) > 0$.

(iv) From (24), for all $x \in [0, 1]$ we can upper bound $|h''(x)|$
by

$$(l-1)(r-1)[lr - l - r - r + 2] < l^2 r^2. \tag{26}$$

Note that $h'(0) = -1$ and, by definition, $h'(x_*) = 0$,
so that $\frac{1}{x_*} = \frac{h'(x_*) - h'(0)}{x_*}$. Consider the function $h'(x)$,
$x \in [0, x_*]$. From the continuity of the function $h'(x)$ and,
using the mean-value theorem, we conclude that there
exists an $\eta \in (0, x_*)$ such that $h''(\eta) = \frac{h'(x_*) - h'(0)}{x_*}$. But
from (26) we know that $h''(\eta) < l^2 r^2$. It follows that

$$\frac{1}{x_*} = \frac{h'(x_*) - h'(0)}{x_*} < l^2 r^2.$$

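(v) To get the universal lower bound on $\kappa_*(\epsilon)$ note that the
dominant (i.e., smaller) term in the definition of $\kappa_*(\epsilon)$ is
$-\frac{h(x_*(\epsilon))}{x_*(\epsilon)}$. (The second term, $-h'(0)$, is 1.) Recall that $x_*$
is the point where $h(x)$ takes on the minimum value in
the range $[0, x_u(\epsilon)]$. We can therefore rewrite $\kappa_*(\epsilon)$ in the
form $\frac{1}{x_*} \max_{0 \le x \le x_u(\epsilon)} \{-h(x)\}$. To get a lower bound
on $\kappa_*(\epsilon)$ we use the trivial upper bound $x_*(\epsilon) \le 1$. It
therefore remains to lower bound $\max_{0 \le x \le x_u(\epsilon)} \{-h(x)\}$.
Notice that $-h(x)$ is a decreasing function of $\epsilon$ for every
$x \in [0, x_u(1)]$. Thus, inserting $\epsilon = 1$, we get

$$\max_{0 \le x \le x_u(1)} \left[x - (1 - (1-x)^{r-1})^{l-1}\right]
= \max_{0 \le x \le x_u(1)} \left[(x^{\frac{1}{l-1}})^{l-1} - (1 - (1-x)^{r-1})^{l-1}\right]
\ge \max_{0 \le x \le (r-1)^{-\frac{l-1}{l-2}}} \left(x^{\frac{1}{l-1}} - (r-1)x\right) x^{\frac{l-2}{l-1}}.$$

Let us see how we derived the last inequality. First we
claim that for $x \in [0, (r-1)^{-\frac{l-1}{l-2}}]$ we have $x^{\frac{1}{l-1}} \ge
(r-1)x \ge 1 - (1-x)^{r-1}$. Indeed, this can be easily
seen by using the identity $1 - (1-x)^{r-1} = x(1 + (1-x) +
\cdots + (1-x)^{r-2})$ and $x \le 1$. Then we use $A^{l-1} - B^{l-1} =
(A - B)(A^{l-2} + A^{l-3}B + \cdots + B^{l-2}) \ge (A - B)A^{l-2}$
for all $0 \le B \le A$. Finally we use

$$(x_u(1))^{\frac{1}{l-1}} = 1 - (1 - x_u(1))^{r-1} \le (r-1)\, x_u(1),$$

so that

$$x_u(1) \ge (r-1)^{-\frac{l-1}{l-2}}. \tag{27}$$

As a consequence $[0, (r-1)^{-\frac{l-1}{l-2}}] \subseteq [0, x_u(1)]$ and hence
we get the last inequality. Now we can further lower
bound the right-hand side above by evaluating it at any
element of $[0, (r-1)^{-\frac{l-1}{l-2}}]$.
We pick $\hat{x} = 2^{-\frac{l-1}{l-2}} (r-1)^{-\frac{l-1}{l-2}}$. Continuing the chain of
inequalities we get

$$\left(x^{\frac{1}{l-1}} - (r-1)x\right) x^{\frac{l-2}{l-1}} \Big|_{x = \hat{x}}
= \frac{1}{2} \big(2(r-1)\big)^{-\frac{1}{l-2}} \big(2(r-1)\big)^{-1}
= \frac{1}{2^{\frac{2l-3}{l-2}} (r-1)^{\frac{l-1}{l-2}}}
\overset{(a)}{\ge} \frac{1}{8 r^2}.$$

Since $l \ge 3$ we have $\frac{2l-3}{l-2} \le 3$ and $\frac{l-1}{l-2} \le 2$. Hence we
obtain (a).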

(vi) Let us prove that for all $x \in (x_u, x^*)$, $h(x)$ is strictly
above the line which contains the point $(x_u, 0)$ and has
slope $\kappa^*$. Denote this line by $l(x)$. More precisely, we
have $l(x) = \kappa^*(x - x_u)$. Suppose to the contrary that
there exists a point $y \in (x_u, x^*)$ such that $h(y) < l(y)$.
In this case we claim that the equation $h(x) - l(x) = 0$
must have at least 4 roots.

This follows from (a) $h(x_u) = l(x_u)$, (b) $h'(x_u) \ge
l'(x_u)$, (c) $h(y) < l(y)$, (d) $h(x^*) \ge l(x^*)$, and, finally,
(e) $h(1) < l(1)$, where $x_u < y < x^* < 1$. If all
these inequalities are strict then the 4 roots are distinct.
Otherwise, some roots will have higher multiplicities.
But if $h(x) - l(x) = 0$ has at least 4 roots then $h''(x) -
l''(x) = 0$ has at least 2 roots. Note that $l''(x) = 0$, since
$l(x)$ is a linear function. This leads to a contradiction,
since, as discussed in part (ii), $h''(x)$ has only one (single)
root in $(0, 1)$.

The other cases can be proved along similar lines.
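Parts (ii), (iv), and (v) of the lemma lend themselves to a numerical spot check. The sketch below assumes the closed form obtained by setting the bracketed factor of $h''(x)$ to zero, locates $x_*(\epsilon)$ by bisection on $h'$ between $0$ and that inflection point, and compares $x_*(\epsilon)$ and $\kappa_*(\epsilon)$ with the universal bounds $1/(l^2 r^2)$ and $1/(8 r^2)$ for the $(3, 6)$ ensemble. The function names and the tested values of $\epsilon$ are illustrative choices.

```python
def h(x, eps, l, r):
    return eps * (1 - (1 - x) ** (r - 1)) ** (l - 1) - x


def dh(x, eps, l, r):
    return (eps * (l - 1) * (r - 1) * (1 - x) ** (r - 2)
            * (1 - (1 - x) ** (r - 1)) ** (l - 2) - 1)


def d2h(x, eps, l, r):
    u = (1 - x) ** (r - 1)
    return (eps * (l - 1) * (r - 1) * (1 - x) ** (r - 3)
            * (1 - u) ** (l - 3) * (u * (l * r - l - r) - r + 2))


def inflection_point(l, r):
    """Root of h''(x) = 0 in (0, 1); it does not depend on eps."""
    return 1.0 - ((r - 2) / (l * r - l - r)) ** (1.0 / (r - 1))


def lower_stationary_point(eps, l, r, iters=200):
    """Bisection on h' over [0, inflection point].

    Assumes eps > eps_BP, so that h'(0) = -1 < 0 and h' > 0 at the
    inflection point (where h' is maximal)."""
    lo, hi = 0.0, inflection_point(l, r)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dh(mid, eps, l, r) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def kappa_lower(eps, l, r):
    x_lo = lower_stationary_point(eps, l, r)
    return min(-dh(0.0, eps, l, r), -h(x_lo, eps, l, r) / x_lo)


l, r = 3, 6
x_infl = inflection_point(l, r)
print(f"inflection point: {x_infl:.4f}")
for eps in (0.44, 0.6, 0.8, 1.0):
    x_lo = lower_stationary_point(eps, l, r)
    print(eps, x_lo > 1 / (l * l * r * r), kappa_lower(eps, l, r) >= 1 / (8 * r * r))
```

For these parameters the universal bounds hold with a wide margin, which is consistent with the remark that the constants in the proof are far from tight.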

Appendix III 
Proof of Lemma|26] 

We split the transition into several stages. Generically, in 
each of the ensuing arguments we consider a section with 
associated value just above the lower bound of the corre- 
sponding interval. We then show that, after a fixed number 
of further sections, the value must exceed the upper bound 
of the corresponding interval. Depending on the length L and 
the entropy of the constellation there might not be sufficiently 
many sections left in the constellation to pass all the way to 
$x_s(\epsilon) - \delta$. In this case the conclusion of the lemma is trivially
fulfilled. Therefore, in the sequel, we can always assume that
there are sufficiently many points in the constellation.

In the sequel, $\kappa_*(\epsilon)$ and $x_*(\epsilon)$ are the specific quantities
for a particular $\epsilon$, whereas $\kappa_*$ and $x_*$ are the strictly positive
universal bounds valid for all $\epsilon$, discussed in Lemma 32. We
write $\kappa_*$ and $x_*$ instead of $\frac{1}{8r^2}$ and $\frac{1}{l^2 r^2}$ to emphasize their
operational meaning.

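(i) Let $\delta > 0$. Then there are at most $w\bigl(\frac{1}{\kappa_*\delta} + 1\bigr)$ sections $i$ with value $x_i$ in the interval $[\delta, x_*(\epsilon)]$.
Let $i$ be the smallest index so that $x_i \ge \delta$. If $x_{i+(w-1)} \ge x_*(\epsilon)$ then the claim is trivially fulfilled. Assume therefore that $x_{i+(w-1)} \le x_*(\epsilon)$. Using the monotonicity of $g(\cdot)$,

$$x_i = \epsilon g(x_{i-(w-1)}, \dots, x_i, \dots, x_{i+(w-1)}) \le \epsilon g(x_{i+(w-1)}, \dots, x_{i+(w-1)}).$$

This implies, by (21) and Lemma 32 (vi),

$$x_{i+(w-1)} - x_i \ge x_{i+(w-1)} - \epsilon g(x_{i+(w-1)}, \dots, x_{i+(w-1)}) = -h(x_{i+(w-1)}) \ge -l(x_{i+(w-1)}) \ge -l(\delta) = \kappa_*(\epsilon)\delta.$$

This is equivalent to

$$x_{i+(w-1)} \ge x_i + \kappa_*(\epsilon)\delta.$$

More generally, using the same line of reasoning,

$$x_{i+\ell(w-1)} \ge x_i + \ell\,\kappa_*(\epsilon)\delta,$$

as long as $x_{i+\ell(w-1)} \le x_*(\epsilon)$.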

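We summarize. The total distance we have to cover is $x_*(\epsilon) - \delta$ and every $(w-1)$ steps we cover a distance of at least $\kappa_*(\epsilon)\delta$ as long as we have not surpassed $x_*(\epsilon)$. Therefore, after $(w-1)\lfloor\frac{x_*(\epsilon)-\delta}{\kappa_*(\epsilon)\delta}\rfloor$ steps we have either passed $x_*(\epsilon)$ or we must be strictly closer to $x_*(\epsilon)$ than $\kappa_*(\epsilon)\delta$. Hence, to cover the remaining distance we need at most $(w-2)$ extra steps. The total number of steps needed is therefore upper bounded by $w - 2 + (w-1)\lfloor\frac{x_*(\epsilon)-\delta}{\kappa_*(\epsilon)\delta}\rfloor$, which, in turn, is upper bounded by $w\bigl(\frac{x_*(\epsilon)}{\kappa_*(\epsilon)\delta} + 1\bigr)$. The final claim follows by bounding $x_*(\epsilon)$ with 1 and $\kappa_*(\epsilon)$ by $\kappa_*$.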
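The counting in step (i) can be illustrated with a few lines of arithmetic. This is only a sketch of the bookkeeping described above; the function names and the numeric values are hypothetical and are not the universal constants of Lemma 32.

```python
import math

def steps_to_cross(x_star, kappa, delta, w):
    """Step count from the argument above: every (w-1) sections gain at
    least kappa*delta, plus at most (w-2) extra sections for the remainder."""
    full_blocks = math.floor((x_star - delta) / (kappa * delta))
    return (w - 2) + (w - 1) * full_blocks

def simplified_bound(x_star, kappa, delta, w):
    """The looser closed form w * (x_star / (kappa * delta) + 1)."""
    return w * (x_star / (kappa * delta) + 1)

# Hypothetical values, chosen only to exercise the two expressions.
detailed = steps_to_cross(x_star=0.07, kappa=0.1, delta=0.01, w=8)
loose = simplified_bound(x_star=0.07, kappa=0.1, delta=0.01, w=8)
```

The detailed count never exceeds the simplified one, which is why the proof can work with the simpler expression.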

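(ii) From $x_*(\epsilon)$ up to $x_u(\epsilon)$ it takes at most $w\bigl(\frac{8}{3\kappa_*(x_*)^2} + 2\bigr)$ sections.

Recall that $\bar{x}_i$ is defined by $\bar{x}_i = \frac{1}{w^2}\sum_{j,k=0}^{w-1} x_{i+j-k}$.

From Lemma 24 (i), $x_i \le \epsilon g(\bar{x}_i, \bar{x}_i, \dots, \bar{x}_i) = \bar{x}_i + h(\bar{x}_i)$.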



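Sum this inequality over all sections from $-\infty$ to $k \le 0$,

$$\sum_{i=-\infty}^{k} x_i \le \sum_{i=-\infty}^{k} \bar{x}_i + \sum_{i=-\infty}^{k} h(\bar{x}_i).$$

Writing $\sum_{i=-\infty}^{k} \bar{x}_i$ in terms of the $x_i$, for all $i$, and rearranging terms,

$$-\sum_{i=-\infty}^{k} h(\bar{x}_i) \le \frac{1}{w^2}\sum_{i=1}^{w-1}\binom{w-i+1}{2}(x_{k+i} - x_{k-i+1}) \le \frac{w}{6}\bigl(x_{k+(w-1)} - x_{k-(w-1)}\bigr).$$

Let us summarize:

$$x_{k+(w-1)} - x_{k-(w-1)} \ge -\frac{6}{w}\sum_{i=-\infty}^{k} h(\bar{x}_i). \qquad (28)$$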
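The rearrangement that leads to (28) rests on an identity for the doubly averaged values $\bar{x}_i$ and on the bound $\frac{1}{w^2}\sum_{i=1}^{w-1}\binom{w-i+1}{2} \le \frac{w}{6}$. The sketch below checks both numerically on a random non-decreasing constellation that vanishes far to the left. The helper names, the random test data, and the choice $w = 4$ are assumptions made only for illustration.

```python
from math import comb
import numpy as np

def averaged(x, w):
    """xbar_i = (1/w^2) * sum_{j,k=0}^{w-1} x_{i+j-k} for i = 0 .. len(x)-w.
    Entries to the left of index 0 are taken to be 0."""
    tri = np.convolve(np.ones(w), np.ones(w)) / w**2   # weights (w-|d|)/w^2
    padded = np.concatenate([np.zeros(w - 1), x])
    return np.convolve(padded, tri, mode="valid")

def check_rearrangement(x, w, k):
    """Return (sum xbar - sum x up to k, binomial-weighted sum, w/6 bound)."""
    xbar = averaged(x, w)
    lhs = xbar[: k + 1].sum() - x[: k + 1].sum()
    mid = sum(comb(w - i + 1, 2) * (x[k + i] - x[k - i + 1])
              for i in range(1, w)) / w**2
    rhs = w / 6 * (x[k + w - 1] - x[k - w + 1])
    return lhs, mid, rhs

rng = np.random.default_rng(0)
w = 4
# Leading zeros stand in for the part of the constellation at -infinity.
x = np.concatenate([np.zeros(w), np.sort(rng.random(30))])
lhs, mid, rhs = check_rearrangement(x, w, k=12)
```

Here `lhs` and `mid` agree up to rounding, and `mid` is at most `rhs` because the constellation is non-decreasing.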



From (i) and our discussion at the beginning, we can assume that there exists a section $k$ so that $x_*(\epsilon) \le x_{k-(w-1)}$. Consider sections $x_{k-(w-1)}, \dots, x_{k+(w-1)}$, so that in addition $x_{k+(w-1)} \le x_u(\epsilon)$. If no such $k$ exists then there are at most $2w - 1$ points in the interval $[x_*(\epsilon), x_u(\epsilon)]$, and the statement is correct a fortiori. From (28) we know that we have to lower bound $-\frac{6}{w}\sum_{i=-\infty}^{k} h(\bar{x}_i)$. Since by assumption $x_{k+(w-1)} \le x_u(\epsilon)$, it follows that $\bar{x}_k \le x_u(\epsilon)$, so that every contribution in the sum $-\sum_{i=-\infty}^{k} h(\bar{x}_i)$ is positive. Further, by (the Spacing) Lemma 25, $w(\bar{x}_i - \bar{x}_{i-1}) \le 1$. Hence,

$$-\frac{6}{w}\sum_{i=-\infty}^{k} h(\bar{x}_i) \ge -6\sum_{i=-\infty}^{k} h(\bar{x}_i)(\bar{x}_i - \bar{x}_{i-1}). \qquad (29)$$



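Since by assumption $x_*(\epsilon) \le x_{k-(w-1)}$, it follows that $\bar{x}_k \ge x_*(\epsilon)$ and by definition $\bar{x}_{-\infty} = 0$. Finally, according to Lemma 32 (iii), $-h(x) \ge \kappa_*(\epsilon)x$ for $x \in [0, x_*(\epsilon)]$. Hence,

$$-6\sum_{i=-\infty}^{k} h(\bar{x}_i)(\bar{x}_i - \bar{x}_{i-1}) \ge 6\kappa_*(\epsilon)\int_0^{x_*(\epsilon)/2} x\,dx = \frac{3}{4}\kappa_*(\epsilon)(x_*(\epsilon))^2. \qquad (30)$$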



The inequality in (30) follows since there must exist a section with value greater than $\frac{x_*(\epsilon)}{2}$ and smaller than $x_*(\epsilon)$. Indeed, suppose, on the contrary, that there is no section with value between $\bigl(\frac{x_*(\epsilon)}{2}, x_*(\epsilon)\bigr)$. Since $\bar{x}_k \ge x_*(\epsilon)$, we must then have that $\bar{x}_k - \bar{x}_{k-1} > \frac{x_*(\epsilon)}{2}$. But by the Spacing Lemma 25 we have that $\bar{x}_k - \bar{x}_{k-1} \le \frac{1}{w}$. This would imply that $\frac{1}{w} > \frac{x_*(\epsilon)}{2}$. In other words, $w < \frac{2}{x_*(\epsilon)}$. Using the universal lower bound on $x_*(\epsilon)$ from Lemma 32 (iv), we conclude that $w < 2l^2r^2$, a contradiction to the hypothesis of the lemma.



Combined with (28) this implies that

$$x_{k+(w-1)} - x_{k-(w-1)} \ge \frac{3}{4}\kappa_*(\epsilon)(x_*(\epsilon))^2.$$

We summarize. The total distance we have to cover is $x_u(\epsilon) - x_*(\epsilon)$ and every $2(w-1)$ steps we cover a distance of at least $\frac{3}{4}\kappa_*(\epsilon)(x_*(\epsilon))^2$ as long as we have not surpassed $x_u(\epsilon)$. Allowing for $2(w-1) - 1$ extra steps to cover the last part, bounding again $w - 1$ by $w$, bounding $x_u(\epsilon) - x_*(\epsilon)$ by 1 and replacing $\kappa_*(\epsilon)$ and $x_*(\epsilon)$ by their universal lower bounds, proves the claim.



(iii) From $x_u(\epsilon)$ to $x_u(\epsilon) + \frac{3\kappa_*(x_*)^2}{4(1+12lr)}$ it takes at most $2w$ sections.



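Let $k$ be the smallest index so that $x_u(\epsilon) \le x_{k-(w-1)}$. It follows that $x_{k-2w+1} \le x_u(\epsilon) \le x_k$. Let $\hat{k}$ be the largest index so that $\bar{x}_{\hat{k}} \le x_u(\epsilon)$. From the previous line we deduce that $k - 2w + 1 \le \hat{k} < k$, so that $k - \hat{k} \le 2w - 1$. We use again (28). Therefore, let us bound $-\frac{6}{w}\sum_{i=-\infty}^{k} h(\bar{x}_i)$. We have

$$-\frac{6}{w}\sum_{i=-\infty}^{k} h(\bar{x}_i) = -\frac{6}{w}\sum_{i=-\infty}^{\hat{k}} h(\bar{x}_i) - \frac{6}{w}\sum_{i=\hat{k}+1}^{k} h(\bar{x}_i) \overset{(a)}{\ge} \frac{3}{4}\kappa_*(\epsilon)(x_*(\epsilon))^2 - 12\,lr\,(x_{k+(w-1)} - x_u(\epsilon)).$$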

We obtain (a) as follows. There are two sums, one from 
—00 to fc and another from fc + 1 to fc. Let us begin with 
the sum from —00 to fc. First, we claim that Xj^ > ^'^'^'^ . 



x.(c) 



Indeed, suppose Xf. < ^ 
of fc. 



Then, using the definition 



k+l 



(r- 1)"^ 
- 2 



> 



2r2' 



25 



^k + l 



Xt < -, 
k — 



But from (the Spacing) Lemma 
a contradiction, since from the hypothesis of the lemma 
w > 2r2. Using (|29]) and (l30| with the integral from to 
$x_*(\epsilon)/2$ we get the first expression in (a). Note that the
integral till x^, (e) /2 suffices because either < (e) or, 
following an argument similar to the one after ([30|, there 
must exist a section with value between { ^*^^ ~^x^,{e)). 
We now focus on the sum from k + 1 to k. From the 
definition of k, for all i E [k + l,k], \h{xi)\ < lr{xi — 
Xu(e)). Indeed, recall from Lemma 32 that < Ir 

for X E [0, 1]. In particular, this implies that the line with 
slope Ir going through the point (xu(e),0) lies above 
h{x) for X > a;u(e). Further, Xi — x^ie) < x^ — a;u(e) < 
Xk+w-i — Finally, using k — k < 2w — 1 we get 

the second expression in (a). 
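From (28) we now conclude that

$$x_{k+(w-1)} - x_u(\epsilon) \ge \frac{3}{4}\kappa_*(\epsilon)(x_*(\epsilon))^2 - 12\,lr\,(x_{k+(w-1)} - x_u(\epsilon)),$$

which is equivalent to

$$x_{k+(w-1)} - x_u(\epsilon) \ge \frac{3\kappa_*(\epsilon)(x_*(\epsilon))^2}{4(1+12\,lr)}.$$

The final claim follows by replacing again $\kappa_*(\epsilon)$ and $x_*(\epsilon)$ by their universal lower bounds $\kappa_*$ and $x_*$.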
(iv) From a;„(e) + ^(\''_}i2l-r) fo Xs{e) — 6 it takes at most 



S min{At""",A"""} 



4(l + 121r) 

Steps, where 



mm K 

™<e<l 



A" 



min A* (e) . 

™<e<l 



From step (iii) we know that within a fixed number of 
steps we reach at least 4^(i!/i2ir i ab ove a;u(e). On the 
other hand we know from Lemma 23 that xq < Xs{e). We 



conclude that a;s(e) — Xu{e) > i|r+mj) ■ From Lemma 
we know that Xs(e'"') — a;u(e'"') = and that this distanc'e 
is strictly increasing for e > e'"'. Therefore there exists a 
unique number, call it e™", e™" > e'"'(l,r), so that 

Xs{e) - a;u(e) > 
if and only if e > e' 



4(1 + 12lr)' 
'™" As defined above let. 



A"'" 



min A*(e). 

™<£<1 



Since e™" > e'"'(l, r), both k™" and A™'" are strictly pos- 
itive. Using similar reasoning as in step (i), we conclude 
that in order to reach from x^^ (e) + '^'^-^i^^-^x) (e) ~ ^ 
it takes at most w ^'^'i-^"i^|!i'l'2in-i steps, where we have 
used the fact that, by assumption, 8 < ^^"^121^) • 
From these four steps we see that we need at most 



< w 



= w- 



K* 

c(l,r) 



1 



min{K™'", A™'"} 3k* (x*) 2 

1 2 
^ min{K™'", A™'"} ^ 3k* (a;*)^ 



5]) 
^5] 



sections in order to reach $x_s(\epsilon) - \delta$ once we reach $\delta$. This constant depends on $(l, r)$ but it is independent of $L$ and $\epsilon$.



Appendix IV 
Proof of Theorem 27

To establish the existence of $x$ with the desired properties, we use the Brouwer FP theorem: it states that every continuous function $f$ from a convex compact subset $S$ of a Euclidean space to $S$ itself has a FP.

Let $z$ denote the one-sided forward DE FP for parameter $\epsilon = 1$. Let the length $L$ be chosen in accordance with
the statement of the theorem. By assumption L > t^-t. Using 
part (ii), we conclude that x{z) > 5(1 — 7), i-C-, z 
Lemma

is non-trivial. By Lemma 22 part (i), it is therefore proper, i.e., it is non-decreasing. Suppose that $\chi(z) \le \chi$. In this case, it is easy to verify that the second statement of the theorem is true. So in the remainder of the proof we assume that $\chi(z) > \chi$.

Consider the Euclidean space $[0,1]^{L+1}$. Let $S(\chi)$ be the subspace

$$S(\chi) = \{x \in [0,1]^{L+1} : \chi(x) = \chi;\; x_i \le z_i,\, i \in [-L, 0];\; x_{-L} \le x_{-L+1} \le \dots \le x_0\}.$$

First note that $S(\chi)$ is non-empty since $z$ is non-trivial and has entropy at least $\chi$. We claim that $S(\chi)$ is convex and compact. Indeed, convexity follows since $S(\chi)$ is a convex polytope (defined as the intersection of half spaces). Since $S(\chi) \subset [0,1]^{L+1}$ and $S(\chi)$ is closed, $S(\chi)$ is compact.

Note that any constellation belonging to $S(\chi)$ has entropy $\chi$ and is increasing, i.e., any such constellation is proper. Our first step is to define a map $V(x)$ which "approximates" the DE equation and is well-suited for applying the Brouwer FP theorem. The final step in our proof is then to show that the FP of the map $V(x)$ is in fact a FP of one-sided DE.

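The map $V(x)$ is constructed as follows. For $x \in S(\chi)$, let $U(x)$ be the map,

$$(U(x))_i = g(x_{i-w+1}, \dots, x_{i+w-1}), \quad i \in [-L, 0].$$

Define $V : S(\chi) \to S(\chi)$ to be the map

$$V(x) = \begin{cases} U(x)\,\frac{\chi}{\chi(U(x))}, & \chi \le \chi(U(x)), \\ \alpha(x)U(x) + (1 - \alpha(x))z, & \text{otherwise}, \end{cases}$$

where

$$\alpha(x) = \frac{\chi(z) - \chi}{\chi(z) - \chi(U(x))}.$$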



Let us show that this map is well-defined. First consider the case $\chi \le \chi(U(x))$. Since $x \in S(\chi)$, $x \le z$ (componentwise). By construction, it follows that $U(x) \le U(z) = z$, where the last step is true since $z$ is the forward FP of DE for $\epsilon = 1$. We conclude that $U(x)\frac{\chi}{\chi(U(x))} \le z$. Further, by construction $\chi\bigl(U(x)\frac{\chi}{\chi(U(x))}\bigr) = \chi$. It is also easy to check that $U(x)$ is non-negative and that it is non-decreasing. It follows that in this case $V(x) \in S(\chi)$.

Consider next the case $\chi > \chi(U(x))$. As we have seen, $x \le z$ so that $\chi(U(x)) \le \chi(U(z)) = \chi(z)$. Together with $\chi > \chi(U(x))$ this shows that $\alpha(x) \in [0, 1]$. Further, the choice of $\alpha(x)$ guarantees that $\chi(V(x)) = \chi$. It is easy to check that $V(x)$ is increasing and bounded above by $z$. This shows that also in this case $V(x) \in S(\chi)$.

We summarize, $V$ maps $S(\chi)$ into itself.
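To make the construction of $V$ concrete, here is a minimal numerical sketch. It assumes the coupled update $g(x_{i-w+1}, \dots, x_{i+w-1}) = \bigl(1 - \frac{1}{w}\sum_{j=0}^{w-1}(1 - \frac{1}{w}\sum_{k=0}^{w-1} x_{i+j-k})^{r-1}\bigr)^{l-1}$, the one-sided boundary convention $x_i = 0$ for $i < -L$ and $x_i = x_0$ for $i > 0$, and the entropy $\chi(x)$ taken as the average of the components. These definitions come from earlier parts of the paper and are restated here as assumptions; all function names and the parameters $(l, r, w, L) = (3, 6, 3, 20)$ are illustrative.

```python
import numpy as np

def U(x, l, r, w):
    """(U(x))_i = g(x_{i-w+1}, ..., x_{i+w-1}) for i in [-L, 0].
    Assumed boundary: x_i = 0 for i < -L and x_i = x_0 for i > 0."""
    ext = np.concatenate([np.zeros(w - 1), x, np.full(w - 1, x[-1])])
    box = np.ones(w) / w
    at_checks = np.convolve(ext, box, mode="valid")   # (1/w) sum_k x_{i+j-k}
    outer = np.convolve((1.0 - at_checks) ** (r - 1), box, mode="valid")
    return np.clip(1.0 - outer, 0.0, 1.0) ** (l - 1)

def entropy(x):
    """chi(x): average of the components (assumed normalization)."""
    return float(np.mean(x))

def forward_fp(l, r, w, L, iters=5000):
    """Forward DE for eps = 1: start from all-ones and iterate U."""
    z = np.ones(L + 1)
    for _ in range(iters):
        z = U(z, l, r, w)
    return z

def V(x, z, chi, l, r, w):
    """Entropy-preserving map: rescale U(x), or mix U(x) with z."""
    u = U(x, l, r, w)
    if chi <= entropy(u):
        return u * chi / entropy(u)
    alpha = (entropy(z) - chi) / (entropy(z) - entropy(u))
    return alpha * u + (1.0 - alpha) * z

l, r, w, L = 3, 6, 3, 20          # illustrative parameters
z = forward_fp(l, r, w, L)
results = {}
for chi in (0.5, 0.02):
    x = z * chi / entropy(z)      # a point of S(chi): scaled copy of z
    results[chi] = V(x, z, chi, l, r, w)
```

In both branches the output has entropy $\chi$ by construction, which is the property the proof needs in order to keep $V$ inside $S(\chi)$.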
In order to be able to invoke Brouwer's theorem we need to show that $V(x)$ is continuous. This means we need to show that for every $x \in S(\chi)$ and for any $\varepsilon > 0$, there exists a $\delta > 0$ such that if $y \in B(x, \delta) \cap S(\chi)$ then $\|V(y) - V(x)\|_2 \le \varepsilon$.

First, note that $U(x)$ and $\chi(x)$ are continuous maps on $S(\chi)$. As a result, $\chi(U(x))$, which is the composition of two continuous maps, is also continuous.

Fix $x \in S(\chi)$. We have three cases: (i) $\chi(U(x)) > \chi$, (ii) $\chi(U(x)) < \chi$, and (iii) $\chi(U(x)) = \chi$.

We start with (i). Let $\rho = \chi(U(x)) - \chi$ and fix $\varepsilon > 0$. From the continuity of $\chi(U(x))$ we know that there exists a ball $B(x, \nu_1)$ of radius $\nu_1 > 0$ so that if $y \in B(x, \nu_1) \cap S(\chi)$ then $|\chi(U(x)) - \chi(U(y))| \le \rho$, so that $\chi(U(y)) \ge \chi$. It follows that for those $y$, $V(y) = U(y)\frac{\chi}{\chi(U(y))}$.

For a subsequent argument we will need also a tight bound on $|\chi(U(x)) - \chi(U(y))|$ itself. Let us therefore choose
7 = min{e, p], 7 > 0. And let us choose vi so that if 
y e B{x,y,) n Six) then \x{U{x)) ~ x{U{y))\ < ^^i^, 
so that x{U{y)) > x- 



We can now invoke Brouwer's FP theorem to conclude that 
y(-) has a FP in S{x), call it x. 

Let us now show that, as a consequence, either there exists 
a one-sided FP of DE with parameter e ~ 1 and entropy 
bounded between — — ^^^^ ^ii(i)) _ ^^^itu^ ^jj^j -^^ or x itself 



8 2r(L+l) 

is a proper one-sided FP of DE with entropy x- Clearly, either 
X ^ x(U{x)) or x{U{x)) < x- I" first case, i.e., if 
X < x{U{x)), then x = V{x) = C/(x)^^^^^. Combined 
with the non-triviality of x, we conclude that a; is a proper 



one-sided FP with entropy x and the channel parameter (given 
— — A less than or equal to 1. Also, from Lemma 



by 
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xiU{x))J ____ 

then conclude that the channel parameter is strictly greater 
than e'"'(l,r). 

Assume now the second case, i.e., assume that x{U{x)) < 
X- This implies that 

X — a{x){U{x)) + (1 — a{x))z. 

But since x<z, 



Further, since U{-) is continuous, there exists z/2 > such a{x)x+ (1 - a{x))z >x = a{x){U{x)) + (1 - a{x))z. 



thatforally e B(x,i/2)nS'(x), \\U{x)~Uiy)\ 
V — xmvJyVx, 1^2}- Then for all y e B{x, v) n S{x), 



2<l 



Choose 



\\V{x)^V{y)h 
U{x) 



X 



U{x) 



U{y) 



< 



xiU{x))>x 
< 



x{U{x)) 

U{y) 



< 



x{U{x)) x{U{x)) 
\\U{x)-U{y)h 



x{U{y)) 
Uiy) 



U{y) 



As a result, x > {U{x)). We will now show that this implies 
the existence of a one-sided FP of DE with parameter e — 1 
and with entropy bounded between '"^ "'^^'^ ^"W) iw 
and X- 



2r(L+l) 



xiUix)) 
\U{y)h 



Let 



and define x'^-' 



[/(a;(^-i)), £ > 1. 



x{U{y)) 
x{U{x))-x{U{y)) 



< 



X 



x{U{x)) - x{U{y)) 



IX 



■I X 2(L + 1) 

where above we used the bound ||C/(y)||2 < (-^ + !)■ 

Using similar logic, one can prove (ii). 

Consider claim (iii). In this case x(^(3i)) — X^ which 
implies that V{x) = U(x). As before, there exists Q <v\ such 
that for all y G B[x,v{) n S'(x), \\U{x) - \J{y)\\2 < §■ Let 
7 = min{x(z) — Xj x}- Since we assumed that x{z) > x, we 
have 7 > 0. Furthermore, there exists < 1/2 such that for all 
yeB{x,iy2)nS{x), \xiU{x))^x{Uiy))\ < 2(^. Choose 
v = min{i/i, 1^2}. Consider y £ B(x, i/) n S{x)- Assume first 
that x{U{y)) > X- Thus, as before, 

\\V{x)-V{y)\\2<e. 

Now let us assume that x{U{y)) < X- Then we have 

11^(3;) - V{y)\\2 = \\U{x) - a{y)U{y) - (1 - a{y))z\\2 
< a{y)\\U{x) - U{y)\\2 + |1 - a{y)\\\Uix) ~ Uiz)\\2 
x{U{y))-x{U{x)) 



By assumption, $x \ge U(x)$, i.e., $x^{(0)} \ge x^{(1)}$. By induction this implies that $x^{(\ell-1)} \ge x^{(\ell)}$, i.e., the sequence $x^{(\ell)}$ is monotonically decreasing. Since it is also bounded from below, it converges to a fixed point of DE with parameter $\epsilon = 1$, call it $x^{(\infty)}$.

We want to show that $x^{(\infty)}$ is non-trivial and we want to give a lower bound on its entropy. We do this by comparing $x^{(\ell)}$ with a constellation that lower-bounds $x^{(\ell)}$ and which converges under DE to a non-trivial FP.

We claim that at least the last $N = \frac{(L+1)(\chi - x_u(1))}{2}$ components of $x$ are above $\frac{\chi + x_u(1)}{2}$:

$$\chi(L+1) = \chi(x)(L+1) \le N + (L+1-N)\frac{\chi + x_u(1)}{2},$$

where on the right hand side we assume (worst case) that the last $N$ components have height 1 and the previous $(L+1-N)$ components have height $\frac{\chi + x_u(1)}{2}$. If we solve the inequality for $N$ we get

$$N \ge (L+1)\frac{\chi - x_u(1)}{2 - \chi - x_u(1)} \ge (L+1)\frac{\chi - x_u(1)}{2}.$$
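The counting step above is a pigeonhole argument: components above the threshold contribute at most 1 each and the rest at most the threshold. The sketch below checks the resulting lower bound on random constellations in $[0,1]$. The value used for $x_u(1)$ is a hypothetical placeholder, and the function names are assumptions for illustration.

```python
import numpy as np

def count_above(x, threshold):
    """Number of components strictly above the threshold."""
    return int(np.sum(x > threshold))

def pigeonhole_lower_bound(n, chi, x_u):
    """Solve chi*n <= N + (n - N)*(chi + x_u)/2 for N."""
    return n * (chi - x_u) / (2.0 - chi - x_u)

rng = np.random.default_rng(1)
x_u = 0.05                      # hypothetical stand-in for x_u(1)
checks = []
for _ in range(200):
    x = np.sort(rng.random(41))            # non-decreasing, values in [0, 1]
    chi = float(np.mean(x))
    if chi > x_u:
        n_above = count_above(x, (chi + x_u) / 2.0)
        checks.append((n_above, pigeonhole_lower_bound(len(x), chi, x_u),
                       len(x) * (chi - x_u) / 2.0))
```

Each tuple holds the observed count, the exact pigeonhole bound, and the weaker bound $(L+1)\frac{\chi - x_u(1)}{2}$ used in the text.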



Consider standard DE for the underlying regular (l,r) 
ensemble and e = 1. We claim that it takes at most m 



max{ 



«;*(l)(x-x,(l))' A*(l)(l 



< 



< 



XU) - x{U{y)) 



76 



xU) - X 



< e, 



where above we used: (i) ||C/(a;) — C/(2)||2 < L + 1, (ii) 



x{U{y)) < X, (iii) xiUix)) 



X (when we explicitly write 



DE steps to go from the value to a value above 

—2^. The proof idea is along the lines used in the proof of 
Lemma 26 Consider the function h{x) as defined in ^ for 
e = 1. Note that x^{l) < 2th|i(i) and that ^ < x,{l) = 1. 
Further, the function h{x) is unimodal and strictly positive in 
the range (a;u(l), Xs(l)) and h{x) is equal to the change in x 
which happens during one iteration, assuming that the current 
^^^^•'-^'f > iii then the statement is trivially 



value is x. If 
true. Otherwise, the progress in each required step is at least
equal to 

1 



min{M'^i|^),M^)} 



>min{K*(l)( 



X- 



"""(^)-x.(l)),A*(l)(l-^ 



2 ^ " 2 

We use Lemma 32 part (vi) to get the last inequality. The claim now follows by observing that the total distance that has to be covered is no more than 1.

Consider the constellation y*^''', which takes the value 
for hi, -TV] and the value 2Ch^ for [-N + 1,0]. By 
construction, y = y^"-' < x^^^^ = x. Define y*^*") — [/(y^''^^-'), 
£ > 1. By monotonicity we know that C/(y(^)) < (and 
hence y(°°) < In particular this is true for £ — m. But 

note that at least the last N — wm positions of y^™^ are above 
Also, by the choice of L, N — wm > N/2. 
Define the constellation w^^"^ which takes the value for 
[-i,-7V/2] and the value ^ for [-iV/2 + 1,0]. Define 
— ^^^U{'i/^^^''), £ > 0. Again, observe that by definition 
and < 1, hence we have < y(°°'. From 



V 

i;(0) < 



Lemma 
we know that for a length $N/2$

and a channel parameter the resulting FP of forward DE 
has entropy at least 

, 1- i 



Iw 



X 



r(x-xu(l))(i + l) 



> 0. 



Above, x' > follows from the first assumption on L 
in the hypothesis of the theorem. It follows that has 
(unnormalized) entropy at least equal to 'x'{N/2) and therefore 
normalized entropy at least x (x-xni^)) ^ 

Since x^°°'> > y(°°) > we conclude that x'^°°^ is a 

1 with entropy bounded 



one-sided FP of DE for parameter e 
between ii^ilip^ - ^ and x- 

Appendix V 
Proof of Theorem 30

(i) Continuity: In phases (i), (ii), and (iv) the map is differentiable by construction. In phase (iii) the map is differentiable in each "period." Further, by definition of the map, the (sub)phases are defined in such a way that the map is continuous at the boundaries.

(ii) Bounds in Phase (i): Consider $\alpha \in [\frac{3}{4}, 1]$. By construction of the EXIT curve, all elements $x_i(\alpha)$, $i \in [-L, 0]$, are the same. In particular, they are all equal to $x_0(\alpha)$. Therefore, all values $\epsilon_i(\alpha)$, $i \in [-L+w-1, 0]$, are identical, and equal to $\epsilon_0(\alpha)$.

For points close to the boundary, i.e., for $i \in [-L, -L+w-2]$, some of the inputs involved in the computation of $\epsilon_i(\alpha)$ are $0$ instead of $x_0(\alpha)$. Therefore, the local channel parameter $\epsilon_i(\alpha)$ has to be strictly bigger than $\epsilon_0(\alpha)$ in order to compensate for this. This explains the lower bound on $\epsilon_i(\alpha)$.

(iii) Bounds in Phase (ii): Let i E [—L,Q] and a E [|i|]- 
Then 

x*^L < = ei{a)g{xi-nj+iia), . . . ,x^+tL,-l{a)) 



< e,{a)g{xQ,...,Xo) = e^{a) 



This gives the lower bound ei{a) > e{xQ)^^. 
(iv) Bounds in Phase (iii): Let $\alpha \in [\frac{1}{4}, \frac{1}{2}]$ and $i \in [-L, 0]$. Note that $x_0(\frac{1}{2}) = x_0^*$ but that $x_0(\frac{1}{4}) = x^*_{-L'+L}$. The range $[\frac{1}{4}, \frac{1}{2}]$ is therefore split into $L' - L$ "periods." In each period, the original solution $x^*$ is "moved in" by one segment. Let $p \in \{1, \dots, L' - L\}$ denote the current period we are operating in. In the sequel we think of $p$ as fixed and consider in detail the interpolation in this period. To simplify our notation, we reparameterize the interpolation so that if $\alpha$ goes from $0$ to $1$, we moved in the original constellation exactly by one more segment. This alternative parametrization is only used in this section. In part (vi), when deriving bounds on $\epsilon^*$, we use again the original parametrization.

Taking this reparameterization into account, for $\alpha \in [0, 1]$, according to Definition 28,

$$x_i(\alpha) = \begin{cases} (x^*_{i-p})^{\alpha}(x^*_{i-p+1})^{1-\alpha}, & i \in [-L, 0], \\ 0, & i < -L. \end{cases}$$

We remark that $x_i(\alpha)$ decreases with $\alpha$. Thus we have for any $\alpha$, $x_i(1) \le x_i(\alpha) \le x_i(0)$. By symmetry, $x_i(\alpha) = x_{-i}(\alpha)$ for $i \ge 1$.
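The interpolation within one period is a geometric mean between two neighboring values of the one-sided FP. A minimal sketch, assuming the form $x_i(\alpha) = (x^*_{i-p})^{\alpha}(x^*_{i-p+1})^{1-\alpha}$ quoted later in the proof; the function name and the sample values 0.2 and 0.4 are hypothetical.

```python
def interpolate(x_left, x_right, alpha):
    """x_i(alpha) = x_left^alpha * x_right^(1 - alpha),
    with x_left = x*_{i-p} <= x_right = x*_{i-p+1}."""
    return x_left ** alpha * x_right ** (1.0 - alpha)

# Hypothetical neighboring FP values, sampled along one period.
alphas = [k / 10 for k in range(11)]
path = [interpolate(0.2, 0.4, a) for a in alphas]
```

Because the left value is no larger than the right one, the path moves monotonically from $x^*_{i-p+1}$ at $\alpha = 0$ down to $x^*_{i-p}$ at $\alpha = 1$, matching the remark that $x_i(\alpha)$ decreases with $\alpha$.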

We start by showing that if Xi(a) > 7 and i E [—L + 
w-l,-iu + l] then ei{a)/e* < 1 + For a E [0, 1], 
define 

1 ^ N 



Further, define 

\ w 



k=0 



k=0 



r-1 



Note that the values x* in the last definition are the values 
of the one-sided FP. In particular, this means that for 
i > we have x* — Xq. 

From the definition of the EXIT curve we have 

Xt{a) 



1 - i;E7=o 



(31) 



By monotonicity, 

w J ~ \ w J e* 

In the first step we used the fact that —L + w — l<i< 
—w + 1 and the second step is true by definition. 
Substituting this into the denominator of pT| ) results in 



^ ^i-p+i ^ " ^ ^i-p+i ^ ^ 



%—p 



i—p 



where we defined {/S.x*)i — x* ~ If we plug 

the upper bound on (Aa::*)i_p+i due to (the Spacing) 



1 



Lemma 25 into this expression we get 

1 1 

< , 



a;*_p+i/(Aa;*)^_p+i - 1 



\ e' ) (l-l)(r-l) 
By assumption $x_i(\alpha) > \gamma$. But from the monotonicity we have $x^*_{i-p+1} = x_i(0) \ge x_i(\alpha)$. Thus $x^*_{i-p+1} \ge \gamma$. This is equivalent to



W 



J (r-l)(l-l) 
As a consequence, 



1 > w^l"^. (32) 



< 1 



1 



a:*_p+i/(Aa;*), - 1 



< 1 



,1/8' 



the promised upper bound. 

Let us now derive the lower bounds. First suppose that 
Xi{oi) > 7. For i e [— L, 0] we can use again monotonic- 
ity to conclude that 



1- 



ELoVi+j(a)\i-i 



This proves that 



i—p+l 



^ -^i-p+l -^i-p+l 

Note that this sequence of inequalities is true for the 
whole range i E [— i, 0]. Since x*_^-^ = 2:^(0) > Xi(a), 
we have x*_p_^_i > 7 and using ( |32j i we have 

{Ax*)i^p+i 



< 



'^i-p+l 



1 



As a consequence, 

> _ (Ax*),_p+i ^ ^ 1 



It remains to consider the last case, i.e., we assume that 



Xi{a) < 7. From Lemma 24 (iv) we have 

fc=0 j,k=0 



fe=0 j,k=0 



i~p~\-j — k ' 



and 



i^-p+i/n^' > 



i—p+w — k 



k=0 



W 



2 -^i-p+l+j-k- 
j.k=0 



We start with ( [ST) . Write 2:^(0;) in the numerator explic- 
itly as {x*_p)" {x*_p^i)^~°' and bound each of the two 
terms by the above expressions. This yields 



VZ^j,fc=0 ^i-p+j-k) \2^jM=0 ^i-p+l+j-k) 

1- ;;ET=oV»+j(a) 



Applying steps, similar to those used to prove Lemma 24 
(ii), to the above denominator, we get: 



1 



^ w — 1 

a) < — ^ Xi+j^k{a) 



j.k=t) 



^ w—1 



■''i-n^i-fej y-'^i-p+l+j-k) 



"i-p+J- 

],k=0 

Combining all these bounds and canceling common terms 
yields 

w—1 

Xi 



>(l--^. 



w 



^i—p+w—k 



k=0 



iJ2jk=Q ^ 



i-p+j-kl \/-^j,k=Q-^i-p+l+j-k 
Z^j^k=a\'^i-p+j-k/ \-^i~p+l+j-kJ 

Applying Hölder's inequality we get

f^w-l * \a(\^w-l * \l-a 

\l^j.,k=0 -^i-p+j-k) \l^j,k=0 -^i-p+l+j-k) 

Z^j"=0 A^k=0 y-^i-p+j-k) y-^i-p+l+j-kl 

Putting everything together we now get 



'i—p-\-'w — k 



r-2 



(33) 



> 1. 



(34) 



fe=0 



By assumption $x_i(\alpha) \le \gamma$. Again from monotonicity we have $x_i(\alpha) \ge x_i(1) = x^*_{i-p}$. Thus $x^*_{i-p} \le \gamma$. Combining this with Lemma 24 (iii) and (19) in the hypothesis of the theorem, we obtain



(r- 1)(1- l)(l + i«i/8) 



w — 1 



W 



^ y,2 <-P 



+j-k- 



i,k=0 



Suppose that x* , r t/ri > — ttr- Then from the 
above inequality we conclude that 

(r-l)(l-l)(l + wi/8) 1 , 

where we set to zero all the terms smaller than 



X 



i—p-\-w— [lo^/^] 

we get 



Upper bounding (1 + w'^/^) by 2w^/^ 
4(r-l)(l-l) >wi/2. 



But this is contrary to the hypothesis of the theorem, w > 



Hence we must have x 
Therefore, 

II! — 1 



< 



1 /w- [w^/®] 



For any two $n$-length real sequences $(a_0, a_1, \dots, a_{n-1})$ and $(b_0, b_1, \dots, b_{n-1})$ and two real numbers $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$, Hölder's inequality asserts that



w— 1 



- in 2^ ^i-p+w-k^ 



$$\sum_{k=0}^{n-1} |a_k b_k| \le \Bigl(\sum_{k=0}^{n-1} |a_k|^p\Bigr)^{1/p}\Bigl(\sum_{k=0}^{n-1} |b_k|^q\Bigr)^{1/q}.$$

" " " fc=0 



fe=0 
where we replace



P+i' 



by 



1 

1,1/S 



and the remaining [w^^®] + 1 values by 1. Thus we have 

^ w—l 

-E 



< 



fc=0 



Using w > 2^® and combining everything, we get 

(v) Area under EXIT Curve: Consider the set of $M$ variable nodes at position $i$, $i \in [-L, L]$. We want to compute their associated EXIT integral, i.e., we want to compute $\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha)$. We use the technique introduced in [2]. We consider the set of $M$ computation trees of height 2 rooted in all variable nodes at position $i$, $i \in [-L, L]$. For each such computation tree there are $l$ check nodes and $1 + l(r-1)$ variable nodes. Each of the leaf variable nodes of each computation tree has a certain position in the range $[i-w+1, i+w-1]$. These positions differ for each computation tree. For each computation tree assign to its root node the channel value $\epsilon_i(\alpha)$, whereas each leaf variable node at position $k$ "sees" the channel value $x_k(\alpha)$.

In order to compute $\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha)$ we proceed as follows. We apply the standard area theorem [13, Theorem 3.81] to the $M$ simple codes represented by these $M$ computation trees. Each such code has length $1 + l(r-1)$ and $l$ (linearly independent) check nodes. As we will discuss shortly, the standard area theorem tells us the value of the sum of the $1 + l(r-1)$ individual EXIT integrals associated to a particular code. This sum consists of the EXIT integral of the root node as well as the $l(r-1)$ EXIT integrals of the leaf nodes. Assume that we can determine the contributions of the EXIT integrals of the leaf nodes for each computation tree. In this case we can subtract the average such contribution from the sum and determine the average EXIT integral associated to the root node. In the ensuing argument, we consider a fixed instance of a computation tree rooted in $i$. We then average over the randomness of the ensemble. For the root node the channel value stays the same for all instances, namely, $\epsilon_i(\alpha)$ as given in Definition 28 of the EXIT curve. Hence, for the root node the average, over the ensemble, is taken only over the EXIT value. Then, exchanging the integral (w.r.t. $\alpha$) and the average and using the fact that each edge associated to the root node behaves independently, we conclude that the average EXIT integral associated to the root node is equal to $\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha)$, the desired quantity. Let us now discuss this program in more detail.

For $i \in [-L+w-1, L-w+1]$ we claim that the average sum of the EXIT integrals associated to any such

A slightly more involved proof shows that the area under the EXIT curve (or more precisely, the value of the EXIT integral) is equal to the design rate, assuming that the design rate is defined in an appropriate way (see the discussion on page 4). For our purpose it is sufficient, however, to determine the area up to bounds of order $w/L$. This simplifies the expressions and the proof.



computation tree is equal to $1 + l(r-2)$. This is true since for $i$ in this range, the positions of all leaf nodes are in the range $[-L, L]$. Now applying the area theorem one can conclude that the average sum of all the $1 + l(r-1)$ EXIT integrals associated to the tree code equals the number of variable nodes minus the number of check nodes: $1 + l(r-1) - l = 1 + l(r-2)$.

For $i \in [-L, -L+w-2] \cup [L-w+2, L]$ the situation is more complicated. It can happen that some of the leaf nodes of the computation tree see a perfect channel for all values $\alpha$ since their position is outside $[-L, L]$. These leaf nodes are effectively not present in the code and we should remove them before counting. Although it would not be too difficult to determine the exact average contribution for such a root variable node we only need bounds: the average sum of the EXIT integrals associated to such a root node is at least $0$ and at most $1 + l(r-2)$.

We summarize: If we consider all computation trees rooted in all variable nodes in the range $[-L, L]$ and apply the standard area theorem to each such tree, then the total average contribution is at least $M(2L-2w+3)(1 + l(r-2))$ and at most $M(2L+1)(1 + l(r-2))$. From these bounds we now have to subtract the contribution of all the leaf nodes of all the computation trees and divide by $M$ in order to determine bounds on $\sum_{i=-L}^{L}\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha)$. Consider the expected contribution of the $l(r-1)$ EXIT integrals of each of the $M$ computation trees rooted at $i$, $i \in [-L+w-1, L-w+1]$. We claim that this contribution is equal to $Ml(r-1)^2/r$. For computation trees rooted in $i \in [-L, -L+w-2] \cup [L-w+2, L]$, on the other hand, this contribution is at least $0$ and at most $Ml(r-1)$.

Let us start with computation trees rooted in $i$, $i \in [-L+w-1, L-w+1]$. Fix $i$. It suffices to consider in detail one "branch" of a computation tree since the EXIT integral is an expected value and expectation is linear. By assumption the root node is at position $i$. It is connected to a check node, let's say at position $j$, $j \in [i, i+w-1]$, where the choice is made uniformly at random. In turn, this check node has $(r-1)$ children. Let the positions of these children be $k_1, \dots, k_{r-1}$, where all these indices are in the range $[j-w+1, j]$, and all choices are independent and are made uniformly at random.
Consider now this check node in more detail and apply the standard area theorem to the corresponding parity-check code of length $r$. The message from the root node is $x_i(\alpha)$, whereas the messages from the leaf nodes are $x_{k_\ell}(\alpha)$, $\ell = 1, \dots, r-1$, respectively. We know from the standard area theorem applied to this parity-check code of length $r$ that the sum of the $r$ EXIT integrals is equal


*To be precise, the proof of the area theorem given in [13, Theorem 3.81] assumes that the channel value of the root node, call it $\epsilon_i(\alpha)$, stays within the range $[0, 1]$. This does not apply in our setting; for $\alpha \to 0$, $\epsilon_i(\alpha)$ becomes unbounded. Nevertheless, it is not hard to show, by explicitly writing down the sum of all EXIT integrals, using integration by parts and finally using the fact that $(x(\alpha), \epsilon(\alpha))$ is a FP, that the result still applies in this more general setting.






to $r - 1$. So the average contribution of one such EXIT integral is $(r-1)/r$, and the average of $(r-1)$ randomly chosen such EXIT integrals is $(r-1)^2/r$. Recalling that so far we only considered 1 out of $l$ branches and that there are $M$ computation trees, the total average contribution of all leaf nodes of all computation trees rooted in $i$ should therefore be $Ml(r-1)^2/r$.
Let us now justify why the contribution of the leaf 
nodes is equal to the "average" contribution. Label the 
r edges of the check node from 1 to r, where "1" labels 
the root node. Further, fix j, the position of the check 
node. As we have seen, we get the associated channels 
(z, fci, . . . , if we root the tree in position i, connect 

to check node j, and then connect further to fci, . . . , k-^-i. 
This particular realization of this branch happens with 
probability w"^ (given that we start in i) and the expected 
number of branches starting in i that have exactly the 
same "type" (i, /ci, . . . , fcr-i) equals Mlw^^ . Consider 
a permutation of (i, fci, . . . , fcr_i) and keep j fixed. 
To be concrete, let's say we consider the permutation 
(fca, i, ^2, . . . , fci). This situation occurs if we root the 
tree in k^, connect to check node j, and then connect 
further to i, k2, . . . , ki. Again, this happens with prob- 
ability ur'^ and the expected number of such branches 
is Mlw^^. It is crucial to observe that all permutations 
of {i,ki, . . . , fcj-_i) occur with equal probability in these 
computation trees and that all the involved integrals occur 
for computation graphs that are rooted in a position in the 
range [~L,L]. Therefore, the "average" contribution of 
the (r — 1) leaf nodes is just a fraction (r — l)/r of 
the total contribution, as claimed. Here, we have used 
a particular notion of "average." We have averaged not 
only over various computation trees rooted at position 
i but also over computation trees rooted let's say in 
position ki, I = l,...r — 1. Indeed, we have averaged 
over an equivalence class given by all permutations of 
(i, fci, . . . , with j, the position of the check node 

held fixed. Since i e [—L + w — 1, L — w + 1], all these 
quantities are also in the range [-~L,L], and so they are 
included in our consideration. 

It remains to justify the "average" contributions that we get for computation trees rooted in $i \in [-L, -L+w-2] \cup [L-w+2, L]$. The notion of average is the same as we have used it above. Even though we are talking about averages, for each computation tree it is clear that the contribution is non-negative since all the involved channel values $x_k(\alpha)$ are increasing functions in $\alpha$. This proves that the average contribution is non-negative. Further, the total uncertainty that we remove by each variable leaf node is at most 1. This proves the upper bound.
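We can now summarize. We have

$$\frac{1}{2L+1}\sum_{i=-L}^{L}\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha) \le 1 - \frac{l}{r} + \frac{2(w-1)}{2L+1}\,\frac{l(r-1)^2}{r} \le 1 - \frac{l}{r} + \frac{w}{L}\,lr,$$

$$\frac{1}{2L+1}\sum_{i=-L}^{L}\int_0^1 h_i(\alpha)\,d\epsilon_i(\alpha) \ge 1 - \frac{l}{r} - \frac{2(w-1)}{2L+1}\Bigl(1 + l(r-1) - \frac{l}{r}\Bigr) \ge 1 - \frac{l}{r} - \frac{w}{L}\,lr.$$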
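The bookkeeping of part (v) can be verified with exact rational arithmetic. The sketch below subtracts the average leaf contribution $l(r-1)^2/r$ from the per-tree total $1 + l(r-2)$ and recovers $1 - l/r$, then evaluates the boundary-corrected bounds. The function names and the sample parameters are assumptions for illustration, and the bound formulas follow the counting of interior and boundary positions described above.

```python
from fractions import Fraction

def root_exit_integral(l, r):
    """Interior positions: tree total minus average leaf contribution."""
    tree_total = 1 + l * (r - 2)               # variables minus checks
    leaves = Fraction(l * (r - 1) ** 2, r)     # average over the l(r-1) leaves
    return tree_total - leaves

def area_bounds(l, r, w, L):
    """Lower and upper bounds on the normalized EXIT area, allowing for
    the 2(w-1) boundary positions out of 2L+1."""
    edge = Fraction(2 * (w - 1), 2 * L + 1)
    base = 1 - Fraction(l, r)
    upper = base + edge * Fraction(l * (r - 1) ** 2, r)
    lower = base - edge * (1 + l * (r - 1) - Fraction(l, r))
    return lower, upper

# Illustrative parameters (hypothetical choice).
interior = root_exit_integral(3, 6)
lower, upper = area_bounds(3, 6, w=4, L=200)
```

For the interior positions the result equals $1 - l/r$ exactly, and both bounds stay within $\frac{w}{L}lr$ of that value.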



(vi) Bound on $\epsilon^*$:

Consider the EXIT function constructed according to Definition 28. Recall that the EXIT value at position $i \in [-L, L]$ is defined by

$$h_i(\alpha) = \bigl(g(x_{i-w+1}(\alpha), \dots, x_{i+w-1}(\alpha))\bigr)^{\frac{l}{l-1}}, \qquad (35)$$

and the area under the EXIT curve is given by

$$A(l, r, w, L) = \int_0^1 \frac{1}{2L+1}\sum_{i=-L}^{L} h_i(\alpha)\,d\epsilon_i(\alpha). \qquad (36)$$

As we have just seen this integral is close to the design rate $R(l, r, w, L)$, and from Lemma 3 we know that this design rate converges to $1 - l/r$ for any fixed $w$ when $L$ tends to infinity.

The basic idea of the proof is the following. We will show that $A(l, r, w, L)$ is also "close" to $1 - \frac{l}{r} - p^{MAP}(x(\epsilon^*))$, where $p^{MAP}(\cdot)$ is the polynomial defined in Lemma 4. In other words, $x(\epsilon^*)$ must be "almost" a zero of $p^{MAP}(\cdot)$. But $p^{MAP}(\cdot)$ has only a single positive root and this root is at $x_s(\epsilon^{MAP}(l, r))$.

More precisely, we first find upper and lower bounds on $A(l, r, w, L)$ by splitting the integral (36) into four phases. We will see that the main contribution to the area comes from the first phase and that this contribution is close to $1 - \frac{l}{r} - p^{MAP}(x(\epsilon^*))$. For all other phases we will show that the contribution can be bounded by a function which does not depend on $(\epsilon^*, x^*)$ and which tends to $0$ if we let $w$ and $L$ tend to infinity.
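For $i \in \{1, 2, 3, 4\}$, define $T_i$ as

$$T_i = \frac{2}{2L+1} \sum_{j=-L+w-1}^{-w+1} \int_{\frac{4-i}{4}}^{\frac{5-i}{4}} h_j(\alpha)\, d\epsilon_j(\alpha).$$

Further, let

$$T_5 = \frac{2}{2L+1} \sum_{i=-L}^{-L+w-2} \int_0^1 h_i(\alpha)\, d\epsilon_i(\alpha), \qquad
T_6 = \frac{1}{2L+1} \sum_{i=-w+2}^{w-2} \int_0^1 h_i(\alpha)\, d\epsilon_i(\alpha).$$

Clearly, $A(l, r, w, L) = T_1 + T_2 + T_3 + T_4 + T_5 + T_6$.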
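We claim that for $w > \max\{2^4 l^2 r^2, 2^{16}\}$,

$$\begin{aligned}
T_1 &= 1 - \frac{l}{r} - p^{\text{MAP}}(x_0^*), \\
-lr\,(x_0^* - x_{-L}^*) \le T_2 &\le r\,(x_0^* - x_{-L}^*), \\
-w^{-\frac18} - \frac{2 r l^2}{(1 - 4w^{-\frac18})^r\, \epsilon^{\text{BP}}(l, r)} \le T_3 &\le 4 l r\, w^{-\frac18}, \\
-lr\, x^*_{-L'+L} \le T_4 &\le r\, x^*_{-L'+L}, \\
-\frac{lw}{L} \le T_5 &\le \frac{w}{L}, \\
-\frac{lw}{L} \le T_6 &\le \frac{w}{L}.
\end{aligned}$$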

If we assume these bounds for a moment, and simplify
the expressions slightly, we see that for
$w > \max\{2^{16}, 2^4 l^2 r^2\}$,

\A{l,r,w,L)-l+^+p"^'{x*))\<4lrw-^ + ^ 

2rl2 

J W s . 

(l-4w"5)=^e'""(l,r) 

Now using the bound in part (v) on the area under the 
EXIT curve we get 



$$|p^{\text{MAP}}(x_0^*)| \le c_1(l, r, w, L),$$



where 



1 2wl wlr 

H \ 

L L 

2rl2 



-w s . 



Ci(l, r, w, L) = 4lrw" 

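From this we can derive a bound on $\epsilon^*$ as follows. Using
Taylor's expansion we get

$$p^{\text{MAP}}(x_s(\epsilon^*)) = p^{\text{MAP}}(x_0^*) + (p^{\text{MAP}}(\eta))'\,(x_s(\epsilon^*) - x_0^*),$$

where $(p^{\text{MAP}}(x))'$ denotes the derivative w.r.t. $x$ and
$\eta \in (x_0^*, x_s(\epsilon^*))$. From Lemma 4 one can verify that
$|(p^{\text{MAP}}(x))'| \le 2lr$ for all $x \in [0, 1]$. Thus,

$$|p^{\text{MAP}}(x_s(\epsilon^*))| \le 2lr\,|x_0^* - x_s(\epsilon^*)| + c_1(l, r, w, L).$$

Now using $p^{\text{MAP}}(x_s(\epsilon^{\text{MAP}})) = 0$ and the fundamental
theorem of calculus we have

$$p^{\text{MAP}}(x_s(\epsilon^*)) = -\int_{x_s(\epsilon^*)}^{x_s(\epsilon^{\text{MAP}})} (p^{\text{MAP}}(x))'\, dx.$$

Further, for a $(l, r)$-regular ensemble we have

$$(p^{\text{MAP}}(x))' = (1 - (1 - x)^{r-1})^l\, \epsilon'(x),$$

where we recall that $\epsilon(x) = x / (1 - (1 - x)^{r-1})^{l-1}$.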
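The derivative identity and the slope bound $2lr$ can be checked numerically. The sketch below again assumes the closed form of $p^{\text{MAP}}$ obtained by integrating the identity from 0 (our assumption, standing in for the polynomial of Lemma 4) and compares central finite differences on a grid. The helper names are hypothetical.

```python
# Sketch: finite-difference check of (p_map)'(x) = (1-(1-x)^(r-1))^l * eps'(x).
# p_map is an ASSUMED closed form with p_map(0) = 0.

def y(x, r):
    return 1.0 - (1.0 - x) ** (r - 1)


def eps(x, l, r):
    return x / y(x, r) ** (l - 1)


def p_map(x, l, r):
    return x + (l - 1) * x * (1.0 - x) ** (r - 1) - (l / r) * (1.0 - (1.0 - x) ** r)


def central_diff(f, x, step=1e-6):
    return (f(x + step) - f(x - step)) / (2.0 * step)


def max_identity_gap(l, r, grid):
    """Largest absolute difference between both sides of the identity on the grid."""
    gap = 0.0
    for x in grid:
        lhs = central_diff(lambda t: p_map(t, l, r), x)
        rhs = y(x, r) ** l * central_diff(lambda t: eps(t, l, r), x)
        gap = max(gap, abs(lhs - rhs))
    return gap


l, r = 3, 6
grid = [0.05 + 0.9 * k / 200 for k in range(201)]  # stays inside (0, 1)
gap = max_identity_gap(l, r, grid)
max_slope = max(abs(central_diff(lambda t: p_map(t, l, r), x)) for x in grid)
print(gap, max_slope, 2 * l * r)
```

On this grid the two sides agree up to finite-difference error, and the observed slope stays well below $2lr$.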

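Next, from Lemma 23 we have that $\epsilon^* > \epsilon^{\text{BP}}$. Thus
$x_s(\epsilon^*) > x^{\text{BP}}$. Also, $\epsilon^{\text{MAP}} > \epsilon^{\text{BP}}$. As a consequence,
$(1 - (1 - x)^{r-1})^l \ge (1 - (1 - x^{\text{BP}})^{r-1})^l$ and $\epsilon'(x) \ge 0$
for all $x$ in the interval of the above integral.
Combining everything we get

$$|p^{\text{MAP}}(x_s(\epsilon^*))| \ge (1 - (1 - x^{\text{BP}})^{r-1})^l \left| \int_{x_s(\epsilon^*)}^{x_s(\epsilon^{\text{MAP}})} \epsilon'(x)\, dx \right|.$$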

Define 

c(l,r, L) = 4lru;~ 



2i(j1 wlr 



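Then, using $\epsilon(x_s(\epsilon^*)) = \epsilon^*$ and $\epsilon(x_s(\epsilon^{\text{MAP}})) = \epsilon^{\text{MAP}}(l, r)$,
the final result is

$$\begin{aligned}
|\epsilon^{\text{MAP}}(l, r) - \epsilon^*|
&\le \frac{2lr\,|x_0^* - x_s(\epsilon^*)| + c(l, r, w, L)}{\epsilon^{\text{BP}}(l, r)\,(1 - (1 - x^{\text{BP}})^{r-1})^l} \\
&\overset{(a)}{=} \frac{2lr\,|x_0^* - x_s(\epsilon^*)| + c(l, r, w, L)}{x^{\text{BP}}(l, r)\,(1 - (1 - x^{\text{BP}})^{r-1})} \\
&\overset{(b)}{\le} \frac{2lr\,|x_0^* - x_s(\epsilon^*)| + c(l, r, w, L)}{(x^{\text{BP}}(l, r))^2} \\
&\overset{\text{Lemma 7}}{\le} \frac{2lr\,|x_0^* - x_s(\epsilon^*)| + c(l, r, w, L)}{(1 - (l-1)^{-\frac{1}{r-2}})^2}.
\end{aligned}$$

To obtain (a) we use that $x^{\text{BP}}$ is a FP of standard DE for
channel parameter $\epsilon^{\text{BP}}$. Also, we use $(1 - (1 - x^{\text{BP}})^{r-1}) \ge x^{\text{BP}}(l, r)$
to get (b).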
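The chain of denominators in the last display can be sanity-checked for a concrete ensemble. The sketch below is a numerical illustration only: it locates $x^{\text{BP}}$ by a plain grid search for the minimiser of $\epsilon(x) = x/(1-(1-x)^{r-1})^{l-1}$, and it assumes that the Lemma 7 bound reads $x^{\text{BP}} \ge 1 - (l-1)^{-1/(r-2)}$. All function and variable names are ours.

```python
# Sketch: compare the successive denominators of the final bound for (l, r) = (3, 6).
# ASSUMPTION: Lemma 7 is read as x_BP >= 1 - (l-1)^(-1/(r-2)).

def y(x, r):
    return 1.0 - (1.0 - x) ** (r - 1)


def eps(x, l, r):
    return x / y(x, r) ** (l - 1)


def bp_point(l, r, n=100000):
    """Grid search for the minimiser of eps on (0, 1]; returns (x_bp, eps_bp)."""
    x_bp = min((k / n for k in range(1, n + 1)), key=lambda x: eps(x, l, r))
    return x_bp, eps(x_bp, l, r)


l, r = 3, 6
x_bp, eps_bp = bp_point(l, r)

denom_first = eps_bp * y(x_bp, r) ** l   # eps_BP * (1-(1-x_BP)^(r-1))^l
denom_a = x_bp * y(x_bp, r)              # after step (a), fixed-point identity
denom_b = x_bp ** 2                      # after step (b)
denom_lemma7 = (1.0 - (l - 1) ** (-1.0 / (r - 2))) ** 2

print(x_bp, eps_bp, denom_first, denom_a, denom_b, denom_lemma7)
```

Each step shrinks the denominator (or leaves it unchanged), so every line of the chain is a weaker but simpler upper bound than the one before it.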

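It remains to verify the bounds on the six integrals. Our
strategy is the following. For $i \in [-L+w-1, -w+1]$
we evaluate the integrals directly in phases (i), (ii), and
(iii), using the general bounds on the quantities $\epsilon_i(\alpha)$. For
the boundary points, i.e., for $i \in [-L, -L+w-2]$ and
$i \in [-w+2, 0]$, as well as for all the positions in phase
(iv), we use the following crude but handy bounds, valid
for $0 \le \alpha_1 \le \alpha_2 \le 1$:

$$\begin{aligned}
\int_{\alpha_1}^{\alpha_2} h_i(\alpha)\, d\epsilon_i(\alpha)
&\le h_i(\alpha_2)\epsilon_i(\alpha_2) - h_i(\alpha_1)\epsilon_i(\alpha_1) \\
&\le x_i(\alpha_2)\bigl(g(x_{i-w+1}(\alpha_2), \dots, x_{i+w-1}(\alpha_2))\bigr)^{\frac{1}{l-1}} \\
&\le x_i(\alpha_2) \le 1,
\end{aligned} \tag{37}$$

$$\begin{aligned}
\int_{\alpha_1}^{\alpha_2} h_i(\alpha)\, d\epsilon_i(\alpha)
&\ge -\int_{\alpha_1}^{\alpha_2} \epsilon_i(\alpha)\, dh_i(\alpha) \\
&\ge -l\bigl[(h_i(\alpha_2))^{\frac1l} - (h_i(\alpha_1))^{\frac1l}\bigr] \ge -l\,(h_i(\alpha_2))^{\frac1l} \ge -l.
\end{aligned} \tag{38}$$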



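To prove (37) use integration by parts to write

$$\int_{\alpha_1}^{\alpha_2} h_i(\alpha)\, d\epsilon_i(\alpha) = \int_{\alpha_1}^{\alpha_2} d\bigl(h_i(\alpha)\epsilon_i(\alpha)\bigr) - \int_{\alpha_1}^{\alpha_2} \epsilon_i(\alpha)\, dh_i(\alpha).$$

Now note that $\epsilon_i(\alpha) \ge 0$ and that $h_i(\alpha)$ is an increasing
function in $\alpha$ by construction. The second term on the
right hand side of the above equality is therefore non-positive
and we get an upper bound if we drop it. We get the
further bounds by inserting the explicit expressions for
$h_i$ and $\epsilon_i$ and by noting that $x_i$ as well as $g$ are upper
bounded by 1.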

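To prove (38) we also use integration by parts, but now
we drop the first term. Since $h_i(\alpha)$ is an increasing
function in $\alpha$ and it is continuous, it is invertible. We can
therefore write the integral in the form $\int_{h_i(\alpha_1)}^{h_i(\alpha_2)} \epsilon_i(h)\, dh$.

Now note that $\epsilon_i(h)\, h = x_i(h)\, g^{\frac{1}{l-1}}(h) = x_i(h)\, h^{\frac1l} \le h^{\frac1l}$,
where we used the fact that $h = g^{\frac{l}{l-1}}$ (recall the
definition of $g(\dots)$ from (35)). This shows that $\epsilon_i(h) \le h^{\frac{1-l}{l}}$.
We conclude that

$$\int_{\alpha_1}^{\alpha_2} \epsilon_i(\alpha)\, dh_i(\alpha) \le \int_{h_i(\alpha_1)}^{h_i(\alpha_2)} h^{\frac{1-l}{l}}\, dh
= l\bigl[(h_i(\alpha_2))^{\frac1l} - (h_i(\alpha_1))^{\frac1l}\bigr] \le l\, h_i(\alpha_2)^{\frac1l} \le l.$$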
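The inequality just derived can be tested on a simple curve. The sketch below does not use the interpolated coupled fixed points of the proof. As a stand-in we assume the uncoupled $(l, r)$ curve parametrised by $x$, with $h(x) = (1-(1-x)^{r-1})^l$ and $\epsilon(x) = x/(1-(1-x)^{r-1})^{l-1}$, for which $\epsilon\, h = x\, h^{1/l}$ holds as well. The integral is approximated by a trapezoidal Stieltjes sum.

```python
# Sketch: check  int eps dh <= l * (h2^(1/l) - h1^(1/l)) <= l  on the
# uncoupled curve (an ASSUMED stand-in for the coupled system of the proof).

def y(x, r):
    return 1.0 - (1.0 - x) ** (r - 1)


def h(x, l, r):
    return y(x, r) ** l


def eps(x, l, r):
    return x / y(x, r) ** (l - 1)


def integral_eps_dh(x1, x2, l, r, n=20000):
    """Trapezoidal estimate of the Stieltjes integral of eps dh for x in [x1, x2]."""
    total = 0.0
    prev = x1
    for k in range(1, n + 1):
        cur = x1 + (x2 - x1) * k / n
        total += 0.5 * (eps(prev, l, r) + eps(cur, l, r)) * (h(cur, l, r) - h(prev, l, r))
        prev = cur
    return total


l, r = 3, 6
x1, x2 = 0.3, 0.9
lhs = integral_eps_dh(x1, x2, l, r)
rhs = l * (h(x2, l, r) ** (1.0 / l) - h(x1, l, r) ** (1.0 / l))
print(lhs, rhs, l)
```

The slack between the two sides comes entirely from replacing $x_i \le 1$ by 1, which is why the bound is crude but uniform in the position $i$.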

The bounds on $T_4$, $T_5$ and $T_6$ are straightforward
applications of (38) and (37). E.g., to prove that $T_6 \le \frac{w}{L}$,
note that there are $2w - 3$ positions that are involved. For
each position we know from (37) that the integral is upper
bounded by 1. The claim now follows since $\frac{2w-3}{2L+1} \le \frac{w}{L}$.
Using (38) leads to the lower bound. Exactly the same
line of reasoning leads to both the bounds for $T_5$.
For the upper bound on $T_4$ we use the second inequality
in (37). We then bound $x_i(\alpha) \le 1$ and use $h_i(\dots)^{\frac1l} = g(\dots)^{\frac{1}{l-1}}$,
cf. (35). Next, we bound each term in the
sum by the maximum term. This maximum is $h_0(\frac14)^{\frac1l}$.
This term can further be upper bounded by
$1 - (1 - x^*_{-L'+L})^{r-1} \le r\, x^*_{-L'+L}$. Indeed, replace all the $x$ values
in $h_0(\frac14)$ by their maximum, $x^*_{-L'+L}$. The lower bound
follows in a similar way using the penultimate inequality
in (38).

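Let us continue with $T_1$. Note that for $\alpha \in [3/4, 1]$ and
$i \in [-L+w-1, -w+1]$, $\epsilon_i(\alpha) = \frac{x_i(\alpha)}{(1 - (1 - x_i(\alpha))^{r-1})^{l-1}}$ and
that $h_i(\alpha) = (1 - (1 - x_i(\alpha))^{r-1})^l$. A direct calculation
shows that

$$\begin{aligned}
T_1 = \int_{3/4}^{1} h_i(\alpha)\, d\epsilon_i(\alpha) &= p^{\text{MAP}}(1) - p^{\text{MAP}}(x_i(3/4)) \\
&= 1 - \frac{l}{r} - p^{\text{MAP}}(x_0(3/4)) \\
&= 1 - \frac{l}{r} - p^{\text{MAP}}(x_0^*).
\end{aligned}$$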
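The "direct calculation" can be mirrored numerically for a single position. This is a sketch under two assumptions of ours: $p^{\text{MAP}}$ is taken to be the antiderivative of $h\,\epsilon'$ with $p^{\text{MAP}}(0) = 0$, and the start value `x0` is an arbitrary illustrative number rather than an actual fixed point $x_0^*$.

```python
# Sketch: int_{x0}^{1} h d(eps) should equal 1 - l/r - p_map(x0).
# p_map is an ASSUMED closed form; x0 is an arbitrary illustrative start value.

def y(x, r):
    return 1.0 - (1.0 - x) ** (r - 1)


def h(x, l, r):
    return y(x, r) ** l


def eps(x, l, r):
    return x / y(x, r) ** (l - 1)


def p_map(x, l, r):
    return x + (l - 1) * x * (1.0 - x) ** (r - 1) - (l / r) * (1.0 - (1.0 - x) ** r)


def integral_h_deps(x0, l, r, n=20000):
    """Trapezoidal estimate of the Stieltjes integral of h d(eps) for x in [x0, 1]."""
    total = 0.0
    prev = x0
    for k in range(1, n + 1):
        cur = x0 + (1.0 - x0) * k / n
        total += 0.5 * (h(prev, l, r) + h(cur, l, r)) * (eps(cur, l, r) - eps(prev, l, r))
        prev = cur
    return total


l, r, x0 = 3, 6, 0.45
area = integral_h_deps(x0, l, r)
closed_form = (1.0 - l / r) - p_map(x0, l, r)
print(area, closed_form)
```

When `x0` is close to the positive root of $p^{\text{MAP}}$, the area is close to $1 - l/r$, which is the mechanism the proof exploits.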
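Let us now compute bounds on $T_2$. Using (37) we get

$$\begin{aligned}
T_2 &\le \frac{2}{2L+1} \sum_{i=-L+w-1}^{-w+1} \bigl(h_i(3/4)\epsilon_i(3/4) - h_i(1/2)\epsilon_i(1/2)\bigr) \\
&\le x_0^*\,(1 - (1 - x_0^*)^{r-1}) - x_{-L}^*\,(1 - (1 - x_{-L}^*)^{r-1}) \\
&\le r\,(x_0^* - x_{-L}^*).
\end{aligned}$$

To obtain the second inequality we use $\epsilon_i(\alpha) h_i(\alpha) = x_i(\alpha)(h_i(\alpha))^{\frac1l}$.
Using the second inequality of (38) we
lower bound $T_2$ as follows. We have

$$\begin{aligned}
T_2 &\ge -l\, \frac{2}{2L+1} \sum_{i=-L+w-1}^{-w+1} \bigl((h_i(3/4))^{\frac1l} - (h_i(1/2))^{\frac1l}\bigr) \\
&\ge -l\,\bigl((1 - x_{-L}^*)^{r-1} - (1 - x_0^*)^{r-1}\bigr) \\
&\ge -lr\,(x_0^* - x_{-L}^*).
\end{aligned}$$

To obtain the second inequality we use $h_i(3/4) = (1 - (1 - x_0^*)^{r-1})^l$
and $h_i(1/2) \ge (1 - (1 - x_{-L}^*)^{r-1})^l$.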
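It remains to bound $T_3$. For $i \in [-L+w-1, -w+1]$,
consider

$$\int_{1/4}^{1/2} d\bigl(h_i(\alpha)\epsilon_i(\alpha)\bigr) = \epsilon^*\bigl(h_i(\tfrac12) - h_i(\tfrac14)\bigr), \tag{39}$$

where we have made use of the fact that for $\alpha = \frac14$ and
$\alpha = \frac12$, $\epsilon_i(\alpha) = \epsilon^*$. To get an upper bound on $T_3$ write

$$\int_{1/4}^{1/2} \epsilon_i(\alpha)\, dh_i(\alpha) \ge \epsilon^*\,\bigl(1 - 4w^{-\frac18}\bigr)^{(r-2)(l-1)} \bigl(h_i(\tfrac12) - h_i(\tfrac14)\bigr).$$

Here we have used the lower bounds on $\epsilon_i(\alpha)$ in
phase (iii) from Theorem 30 and the fact that
$w > \max\{2^{16}, 2^4 l^2 r^2\}$. Again using integration by parts, and
upper bounding both $\epsilon^*$ and $(h_i(1/2) - h_i(1/4))$ by 1,
we conclude that

$$\int_{1/4}^{1/2} h_i(\alpha)\, d\epsilon_i(\alpha) \le 1 - \Bigl(1 - \frac{4}{w^{1/8}}\Bigr)^{(r-2)(l-1)} \le 4rl\, w^{-\frac18}.$$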
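The last step, from $1 - (1 - 4w^{-1/8})^{(r-2)(l-1)}$ to $4rl\,w^{-1/8}$, is an instance of $1 - (1-t)^n \le nt$ for $t \in [0, 1]$. The short check below evaluates both sides for a few values of $w$. The choice of $(l, r) = (3, 6)$ and of the sample values of $w$ is ours. For $w \ge 2^{16}$ we have $4w^{-1/8} \le 1$, so the base of the power stays in $[0, 1]$, which appears to be one reason for the $2^{16}$ term in the condition on $w$.

```python
# Sketch: compare 1 - (1 - 4 w^(-1/8))^((r-2)(l-1)) with 4 r l w^(-1/8).

def phase3_gap(w, l, r):
    return 1.0 - (1.0 - 4.0 * w ** (-1.0 / 8.0)) ** ((r - 2) * (l - 1))


def linear_bound(w, l, r):
    return 4.0 * r * l * w ** (-1.0 / 8.0)


l, r = 3, 6
rows = [(w, phase3_gap(w, l, r), linear_bound(w, l, r))
        for w in (2 ** 16, 2 ** 20, 2 ** 32, 2 ** 64)]
for w, gap, bound in rows:
    print(w, gap, bound)
```

Both quantities decay only like $w^{-1/8}$, so very large coupling widths are needed before this term of the bound becomes small.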



For the lower bound we can proceed in a similar fashion.
We first apply integration by parts. Again using (39),
the first term corresponding to the total derivative can
be written as

$$\frac{2}{2L+1} \sum_{i=-L+w-1}^{-w+1} \epsilon^*\bigl(h_i(\tfrac12) - h_i(\tfrac14)\bigr).$$



We write the other term in the integration by parts as
follows. For every section number $i \in [-L+w-1, -w+1]$,
let $\beta_i$ correspond to the smallest number in $[\frac14, \frac12]$
such that $x_i(\beta_i) > \gamma$. Recall the definition of $\gamma$ from
part (iv) of Theorem 30. If for any section number $i$,
$x_i(\frac12) > \gamma$, then $\beta_i$ is well-defined and $x_i(\alpha) > \gamma$ for all
$\alpha \in [\beta_i, \frac12]$. Indeed, this follows from the continuity and
the monotonicity of $x_i(\alpha)$ w.r.t. $\alpha$. On the other hand, if
$x_i(\frac12) \le \gamma$, we set $\beta_i = \frac12$. Then we can write the second
term as

$$-\frac{2}{2L+1} \sum_{i=-L+w-1}^{-w+1} \Bigl( \int_{1/4}^{\beta_i} \epsilon_i(\alpha)\, dh_i(\alpha) + \int_{\beta_i}^{1/2} \epsilon_i(\alpha)\, dh_i(\alpha) \Bigr).$$



We now lower bound the two integrals as follows. For
$\alpha \in [\beta_i, \frac12]$ we use the upper bound on $\epsilon_i(\alpha)$ valid in
phase (iii) from Theorem 30. This gives us the lower
bound



2L + 1 



-w+l 

E ' 

-L+tu-l 



''■i(j)). 



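where above we used the fact that $h_i(\beta_i) \ge h_i(\frac14)$.
For $\alpha \in [\frac14, \beta_i]$ we use the universal bound $-l\, h_i(\beta_i)^{\frac1l}$
(on $-\int_{1/4}^{\beta_i} \epsilon_i(\alpha)\, dh_i(\alpha)$) stated in (38). Since
$1/4 \le \beta_i \le 1/2$, using the lower bound
$\epsilon_i(\beta_i) \ge \epsilon^*\,(1 - 4w^{-\frac18})^{(r-2)(l-1)}$
(in phase (iii) of Theorem 30), we get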

> -i( ^ 

^eB''(l,r)(l-4w"5)r 

Above we use $\epsilon^* \ge \epsilon^{\text{BP}}(l, r)$, replace $(r - 2)$ by $r$ and
$(\epsilon^{\text{BP}}(l, r))^{\frac1l}$ by $\epsilon^{\text{BP}}(l, r)$. Putting everything together,



> 1 - 1 



-w 



1 



,1/8 



- 1 



7^ 



7^ 



''(l-4w-s)r 



'■(l-4w"5)=^ 



Since 71-1 < the final result is 

W 8 

_i 2t1^ 